{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "12e91914-5f51-43fa-b65b-625e73b4d17b",
   "metadata": {
    "id": "12e91914-5f51-43fa-b65b-625e73b4d17b"
   },
   "source": [
    "<table style=\"width:100%\">\n",
    "<tr>\n",
    "<td style=\"vertical-align:middle; text-align:left;\">\n",
    "<font size=\"2\">\n",
    "Supplementary code for the <a href=\"http://mng.bz/orYv\">Build a Large Language Model From Scratch</a> book by <a href=\"https://sebastianraschka.com\">Sebastian Raschka</a><br>\n",
    "<br>Code repository: <a href=\"https://github.com/rasbt/LLMs-from-scratch\">https://github.com/rasbt/LLMs-from-scratch</a>\n",
    "</font>\n",
    "</td>\n",
    "<td style=\"vertical-align:middle; text-align:left;\">\n",
    "<a href=\"http://mng.bz/orYv\"><img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/cover-small.webp?1\" width=\"100px\"></a>\n",
    "</td>\n",
    "</tr>\n",
    "</table>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c2520ec3-722f-4f44-bdd1-885b13e7afbf",
   "metadata": {
    "id": "c2520ec3-722f-4f44-bdd1-885b13e7afbf"
   },
   "source": [
    "# Chapter 7: Finetuning To Follow Instructions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "4e19327b-6c02-4881-ad02-9b6d3ec0b1b4",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "4e19327b-6c02-4881-ad02-9b6d3ec0b1b4",
    "outputId": "bcdfe2cb-d084-4920-d703-503131aabec3"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "numpy version: 2.3.4\n",
      "matplotlib version: 3.10.7\n",
      "tiktoken version: 0.12.0\n",
      "torch version: 2.9.0\n",
      "tqdm version: 4.67.1\n",
      "tensorflow version: 2.20.0\n"
     ]
    }
   ],
   "source": [
    "from importlib.metadata import version\n",
    "\n",
    "pkgs = [\n",
    "    \"numpy\",       # PyTorch & TensorFlow dependency\n",
    "    \"matplotlib\",  # Plotting library\n",
    "    \"tiktoken\",    # Tokenizer\n",
    "    \"torch\",       # Deep learning library\n",
    "    \"tqdm\",        # Progress bar\n",
    "    \"tensorflow\",  # For OpenAI's pretrained weights\n",
    "]\n",
    "for p in pkgs:\n",
    "    print(f\"{p} version: {version(p)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "264fca98-2f9a-4193-b435-2abfa3b4142f",
   "metadata": {
    "id": "264fca98-2f9a-4193-b435-2abfa3b4142f"
   },
   "source": [
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/01.webp\" width=500px>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8bbc68e9-75b3-41f1-ac2c-e071c3cd0813",
   "metadata": {
    "id": "8bbc68e9-75b3-41f1-ac2c-e071c3cd0813"
   },
   "source": [
    "## 7.1 Introduction to instruction finetuning"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "53dba24a-6805-496c-9a7f-c75e2d3527ab",
   "metadata": {
    "id": "53dba24a-6805-496c-9a7f-c75e2d3527ab"
   },
   "source": [
    "- In chapter 5, we saw that pretraining an LLM involves a training procedure where it learns to generate one word at a time\n",
    "- Hence, a pretrained LLM is good at text completion, but it is not good at following instructions\n",
    "- In this chapter, we teach the LLM to follow instructions better"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "18dc0535-0904-44ed-beaf-9b678292ef35",
   "metadata": {
    "id": "18dc0535-0904-44ed-beaf-9b678292ef35"
   },
   "source": [
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/02.webp\" width=500px>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b4698b23-12e0-4bd7-a140-ccb3dd71d4e8",
   "metadata": {
    "id": "b4698b23-12e0-4bd7-a140-ccb3dd71d4e8"
   },
   "source": [
    "- The topics covered in this chapter are summarized in the figure below\n",
    "\n",
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/03.webp\" width=500px>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5384f0cf-ef3c-4436-a5fa-59bd25649f86",
   "metadata": {
    "id": "5384f0cf-ef3c-4436-a5fa-59bd25649f86"
   },
   "source": [
    "## 7.2 Preparing a dataset for supervised instruction finetuning"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f8b34ff8-619f-4e89-bd03-ce513269760d",
   "metadata": {
    "id": "f8b34ff8-619f-4e89-bd03-ce513269760d"
   },
   "source": [
    "- We will work with an instruction dataset I prepared for this chapter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "0G3axLw6kY1N",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "0G3axLw6kY1N",
    "outputId": "07e1e4f9-026c-48c1-8a06-f2bfb1fb354e"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Number of entries: 1100\n"
     ]
    }
   ],
   "source": [
    "import json\n",
    "import os\n",
    "import requests\n",
    "\n",
    "\n",
    "def download_and_load_file(file_path, url):\n",
    "    if not os.path.exists(file_path):\n",
    "        response = requests.get(url, timeout=30)\n",
    "        response.raise_for_status()\n",
    "        text_data = response.text\n",
    "        with open(file_path, \"w\", encoding=\"utf-8\") as file:\n",
    "            file.write(text_data)\n",
    "\n",
    "    with open(file_path, \"r\", encoding=\"utf-8\") as file:\n",
    "        data = json.load(file)\n",
    "\n",
    "    return data\n",
    "\n",
    "\n",
    "# The book originally used the following code below\n",
    "# However, urllib uses older protocol settings that\n",
    "# can cause problems for some readers using a VPN.\n",
    "# The `requests` version above is more robust\n",
    "# in that regard.\n",
    "\n",
    "\"\"\"\n",
    "import urllib\n",
    "\n",
    "def download_and_load_file(file_path, url):\n",
    "\n",
    "    if not os.path.exists(file_path):\n",
    "        with urllib.request.urlopen(url) as response:\n",
    "            text_data = response.read().decode(\"utf-8\")\n",
    "        with open(file_path, \"w\", encoding=\"utf-8\") as file:\n",
    "            file.write(text_data)\n",
    "\n",
    "    else:\n",
    "        with open(file_path, \"r\", encoding=\"utf-8\") as file:\n",
    "            text_data = file.read()\n",
    "\n",
    "    with open(file_path, \"r\", encoding=\"utf-8\") as file:\n",
    "        data = json.load(file)\n",
    "\n",
    "    return data\n",
    "\"\"\"\n",
    "\n",
    "\n",
    "file_path = \"instruction-data.json\"\n",
    "url = (\n",
    "    \"https://raw.githubusercontent.com/rasbt/LLMs-from-scratch\"\n",
    "    \"/main/ch07/01_main-chapter-code/instruction-data.json\"\n",
    ")\n",
    "\n",
    "data = download_and_load_file(file_path, url)\n",
    "print(\"Number of entries:\", len(data))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d7af8176-4255-4e92-8c7d-998771733eb8",
   "metadata": {
    "id": "d7af8176-4255-4e92-8c7d-998771733eb8"
   },
   "source": [
    "- Each item in the `data` list we loaded from the JSON file above is a dictionary in the following form"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "-LiuBMsHkzQV",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "-LiuBMsHkzQV",
    "outputId": "a4ee5c2d-db53-4a80-e5ee-0bbcf6fe0450"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Example entry:\n",
      " {'instruction': 'Identify the correct spelling of the following word.', 'input': 'Ocassion', 'output': \"The correct spelling is 'Occasion.'\"}\n"
     ]
    }
   ],
   "source": [
    "print(\"Example entry:\\n\", data[50])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c5a32b34-485a-4816-a77a-da14f9fe6e46",
   "metadata": {
    "id": "c5a32b34-485a-4816-a77a-da14f9fe6e46"
   },
   "source": [
    "- Note that the `'input'` field can be empty:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "uFInFxDDk2Je",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "uFInFxDDk2Je",
    "outputId": "b4f84027-bb9e-4e51-b79e-1329c8bff093"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Another example entry:\n",
      " {'instruction': \"What is an antonym of 'complicated'?\", 'input': '', 'output': \"An antonym of 'complicated' is 'simple'.\"}\n"
     ]
    }
   ],
   "source": [
    "print(\"Another example entry:\\n\", data[999])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f034799a-6575-45fd-98c9-9d1012d0fd58",
   "metadata": {
    "id": "f034799a-6575-45fd-98c9-9d1012d0fd58"
   },
   "source": [
    "- Instruction finetuning is often referred to as \"supervised instruction finetuning\" because it involves training a model on a dataset where the input-output pairs are explicitly provided\n",
    "- There are different ways to format the entries as inputs to the LLM; the figure below illustrates two example formats that were used for training the Alpaca (https://crfm.stanford.edu/2023/03/13/alpaca.html) and Phi-3 (https://arxiv.org/abs/2404.14219) LLMs, respectively"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dffa4f70-44d4-4be4-89a9-2159f4885b10",
   "metadata": {
    "id": "dffa4f70-44d4-4be4-89a9-2159f4885b10"
   },
   "source": [
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/04.webp?2\" width=500px>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dd79a74e-befb-491c-be49-f777a6a5b6a6",
   "metadata": {
    "id": "dd79a74e-befb-491c-be49-f777a6a5b6a6"
   },
   "source": [
    "- In this chapter, we use Alpaca-style prompt formatting, which was the original prompt template for instruction finetuning\n",
    "- Below, we format the input that we will pass as input to the LLM"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "Jhk37nnJnkBh",
   "metadata": {
    "id": "Jhk37nnJnkBh"
   },
   "outputs": [],
   "source": [
    "def format_input(entry):\n",
    "    instruction_text = (\n",
    "        f\"Below is an instruction that describes a task. \"\n",
    "        f\"Write a response that appropriately completes the request.\"\n",
    "        f\"\\n\\n### Instruction:\\n{entry['instruction']}\"\n",
    "    )\n",
    "\n",
    "    input_text = f\"\\n\\n### Input:\\n{entry['input']}\" if entry[\"input\"] else \"\"\n",
    "\n",
    "    return instruction_text + input_text"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "011e78b4-e89a-4653-a2ee-7b2739ca04d6",
   "metadata": {
    "id": "011e78b4-e89a-4653-a2ee-7b2739ca04d6"
   },
   "source": [
    "- A formatted response with input field looks like as shown below"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "F9UQRfjzo4Js",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "F9UQRfjzo4Js",
    "outputId": "7b615d35-2a5f-474d-9292-a69bc3850e16"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n",
      "\n",
      "### Instruction:\n",
      "Identify the correct spelling of the following word.\n",
      "\n",
      "### Input:\n",
      "Ocassion\n",
      "\n",
      "### Response:\n",
      "The correct spelling is 'Occasion.'\n"
     ]
    }
   ],
   "source": [
    "model_input = format_input(data[50])\n",
    "desired_response = f\"\\n\\n### Response:\\n{data[50]['output']}\"\n",
    "\n",
    "print(model_input + desired_response)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4dc93ddf-431c-49c0-96f2-fb3a79c4d94c",
   "metadata": {
    "id": "4dc93ddf-431c-49c0-96f2-fb3a79c4d94c"
   },
   "source": [
    "- Below is a formatted response without an input field"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "a3891fa9-f738-41cd-946c-80ef9a99c346",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "a3891fa9-f738-41cd-946c-80ef9a99c346",
    "outputId": "2142c5a4-b594-49c5-affe-2d963a7bd46b"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n",
      "\n",
      "### Instruction:\n",
      "What is an antonym of 'complicated'?\n",
      "\n",
      "### Response:\n",
      "An antonym of 'complicated' is 'simple'.\n"
     ]
    }
   ],
   "source": [
    "model_input = format_input(data[999])\n",
    "desired_response = f\"\\n\\n### Response:\\n{data[999]['output']}\"\n",
    "\n",
    "print(model_input + desired_response)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4aa8afd5-2a21-49a5-90c3-6a03865a4771",
   "metadata": {
    "id": "4aa8afd5-2a21-49a5-90c3-6a03865a4771"
   },
   "source": [
    "- Lastly, before we prepare the PyTorch data loaders in the next section, we divide the dataset into a training, validation, and test set"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "aFZVopbIlNfx",
   "metadata": {
    "id": "aFZVopbIlNfx"
   },
   "outputs": [],
   "source": [
    "train_portion = int(len(data) * 0.85)  # 85% for training\n",
    "test_portion = int(len(data) * 0.1)    # 10% for testing\n",
    "val_portion = len(data) - train_portion - test_portion  # Remaining 5% for validation\n",
    "\n",
    "train_data = data[:train_portion]\n",
    "test_data = data[train_portion:train_portion + test_portion]\n",
    "val_data = data[train_portion + test_portion:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "-zf6oht6bIUQ",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "-zf6oht6bIUQ",
    "outputId": "657ec5c6-4caa-4d1a-ba2e-23acd755ab07"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Training set length: 935\n",
      "Validation set length: 55\n",
      "Test set length: 110\n"
     ]
    }
   ],
   "source": [
    "print(\"Training set length:\", len(train_data))\n",
    "print(\"Validation set length:\", len(val_data))\n",
    "print(\"Test set length:\", len(test_data))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fcaaf606-f913-4445-8301-632ae10d387d",
   "metadata": {
    "id": "fcaaf606-f913-4445-8301-632ae10d387d"
   },
   "source": [
    "## 7.3 Organizing data into training batches"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "233f63bd-9755-4d07-8884-5e2e5345cf27",
   "metadata": {
    "id": "233f63bd-9755-4d07-8884-5e2e5345cf27"
   },
   "source": [
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/05.webp?1\" width=500px>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c149fc1a-7757-4ec8-80cb-e2a3fb007a2c",
   "metadata": {
    "id": "c149fc1a-7757-4ec8-80cb-e2a3fb007a2c"
   },
   "source": [
    "- We tackle this dataset batching in several steps, as summarized in the figure below\n",
    "\n",
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/06.webp?1\" width=500px>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b9af423f-aad9-4b3c-bea5-153021c04862",
   "metadata": {
    "id": "b9af423f-aad9-4b3c-bea5-153021c04862"
   },
   "source": [
    "- First, we implement an `InstructionDataset` class that pre-tokenizes all inputs in the dataset, similar to the `SpamDataset` in chapter 6\n",
    "\n",
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/07.webp?1\" width=500px>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "adc29dc4-f1c7-4c71-937b-95119d6239bb",
   "metadata": {
    "id": "adc29dc4-f1c7-4c71-937b-95119d6239bb"
   },
   "outputs": [],
   "source": [
    "import torch\n",
    "from torch.utils.data import Dataset\n",
    "\n",
    "\n",
    "class InstructionDataset(Dataset):\n",
    "    def __init__(self, data, tokenizer):\n",
    "        self.data = data\n",
    "\n",
    "        # Pre-tokenize texts\n",
    "        self.encoded_texts = []\n",
    "        for entry in data:\n",
    "            instruction_plus_input = format_input(entry)\n",
    "            response_text = f\"\\n\\n### Response:\\n{entry['output']}\"\n",
    "            full_text = instruction_plus_input + response_text\n",
    "            self.encoded_texts.append(\n",
    "                tokenizer.encode(full_text)\n",
    "            )\n",
    "\n",
    "    def __getitem__(self, index):\n",
    "        return self.encoded_texts[index]\n",
    "\n",
    "    def __len__(self):\n",
    "        return len(self.data)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "384f0e69-4b22-41c0-a25d-f077527eddd1",
   "metadata": {
    "id": "384f0e69-4b22-41c0-a25d-f077527eddd1"
   },
   "source": [
    "- Similar to chapter 6, we want to collect multiple training examples in a batch to accelerate training; this requires padding all inputs to a similar length\n",
    "- Also similar to the previous chapter, we use the `<|endoftext|>` token as a padding token"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "ff24fe1a-5746-461c-ad3d-b6d84a1a7c96",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "ff24fe1a-5746-461c-ad3d-b6d84a1a7c96",
    "outputId": "ac44227b-9ec2-4131-9df8-89caa6e879ca"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[50256]\n"
     ]
    }
   ],
   "source": [
    "import tiktoken\n",
    "tokenizer = tiktoken.get_encoding(\"gpt2\")\n",
    "\n",
    "print(tokenizer.encode(\"<|endoftext|>\", allowed_special={\"<|endoftext|>\"}))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9e5bd7bc-f347-4cf8-a0c2-94cb8799e427",
   "metadata": {
    "id": "9e5bd7bc-f347-4cf8-a0c2-94cb8799e427"
   },
   "source": [
    "- In chapter 6, we padded all examples in a dataset to the same length\n",
    "  - Here, we take a more sophisticated approach and develop a custom \"collate\" function that we can pass to the data loader\n",
    "  - This custom collate function pads the training examples in each batch to have the same length (but different batches can have different lengths)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "65c4d943-4aa8-4a44-874e-05bc6831fbd3",
   "metadata": {
    "id": "65c4d943-4aa8-4a44-874e-05bc6831fbd3"
   },
   "source": [
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/08.webp?1\" width=500px>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "eb4c77dd-c956-4a1b-897b-b466909f18ca",
   "metadata": {
    "id": "eb4c77dd-c956-4a1b-897b-b466909f18ca"
   },
   "outputs": [],
   "source": [
    "def custom_collate_draft_1(\n",
    "    batch,\n",
    "    pad_token_id=50256,\n",
    "    device=\"cpu\"\n",
    "):\n",
    "    # Find the longest sequence in the batch\n",
    "    # and increase the max length by +1, which will add one extra\n",
    "    # padding token below\n",
    "    batch_max_length = max(len(item)+1 for item in batch)\n",
    "\n",
    "    # Pad and prepare inputs\n",
    "    inputs_lst = []\n",
    "\n",
    "    for item in batch:\n",
    "        new_item = item.copy()\n",
    "        # Add an <|endoftext|> token\n",
    "        new_item += [pad_token_id]\n",
    "        # Pad sequences to batch_max_length\n",
    "        padded = (\n",
    "            new_item + [pad_token_id] *\n",
    "            (batch_max_length - len(new_item))\n",
    "        )\n",
    "        # Via padded[:-1], we remove the extra padded token\n",
    "        # that has been added via the +1 setting in batch_max_length\n",
    "        # (the extra padding token will be relevant in later codes)\n",
    "        inputs = torch.tensor(padded[:-1])\n",
    "        inputs_lst.append(inputs)\n",
    "\n",
    "    # Convert list of inputs to tensor and transfer to target device\n",
    "    inputs_tensor = torch.stack(inputs_lst).to(device)\n",
    "    return inputs_tensor"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "8fb02373-59b3-4f3a-b1d1-8181a2432645",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "8fb02373-59b3-4f3a-b1d1-8181a2432645",
    "outputId": "93d987b9-e3ca-4857-9b28-b67d515a94d8"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([[    0,     1,     2,     3,     4],\n",
      "        [    5,     6, 50256, 50256, 50256],\n",
      "        [    7,     8,     9, 50256, 50256]])\n"
     ]
    }
   ],
   "source": [
    "inputs_1 = [0, 1, 2, 3, 4]\n",
    "inputs_2 = [5, 6]\n",
    "inputs_3 = [7, 8, 9]\n",
    "\n",
    "batch = (\n",
    "    inputs_1,\n",
    "    inputs_2,\n",
    "    inputs_3\n",
    ")\n",
    "\n",
    "print(custom_collate_draft_1(batch))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5673ade5-be4c-4a2c-9a9a-d5c63fb1c424",
   "metadata": {},
   "source": [
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/09.webp?1\" width=400px>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "17769a19-b961-4213-92ef-34f441b2d1d6",
   "metadata": {
    "id": "17769a19-b961-4213-92ef-34f441b2d1d6"
   },
   "source": [
    "- Above, we only returned the inputs to the LLM; however, for LLM training, we also need the target values\n",
    "- Similar to pretraining an LLM, the targets are the inputs shifted by 1 position to the right, so the LLM learns to predict the next token"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0386b6fe-3455-4e70-becd-a5a4681ba2ef",
   "metadata": {
    "id": "0386b6fe-3455-4e70-becd-a5a4681ba2ef"
   },
   "source": [
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/10.webp?1\" width=400px>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "74af192e-757c-4c0a-bdf9-b7eb25bf6ebc",
   "metadata": {
    "id": "74af192e-757c-4c0a-bdf9-b7eb25bf6ebc"
   },
   "outputs": [],
   "source": [
    "def custom_collate_draft_2(\n",
    "    batch,\n",
    "    pad_token_id=50256,\n",
    "    device=\"cpu\"\n",
    "):\n",
    "    # Find the longest sequence in the batch\n",
    "    batch_max_length = max(len(item)+1 for item in batch)\n",
    "\n",
    "    # Pad and prepare inputs\n",
    "    inputs_lst, targets_lst = [], []\n",
    "\n",
    "    for item in batch:\n",
    "        new_item = item.copy()\n",
    "        # Add an <|endoftext|> token\n",
    "        new_item += [pad_token_id]\n",
    "        # Pad sequences to max_length\n",
    "        padded = (\n",
    "            new_item + [pad_token_id] *\n",
    "            (batch_max_length - len(new_item))\n",
    "        )\n",
    "        inputs = torch.tensor(padded[:-1])  # Truncate the last token for inputs\n",
    "        targets = torch.tensor(padded[1:])  # Shift +1 to the right for targets\n",
    "        inputs_lst.append(inputs)\n",
    "        targets_lst.append(targets)\n",
    "\n",
    "    # Convert list of inputs to tensor and transfer to target device\n",
    "    inputs_tensor = torch.stack(inputs_lst).to(device)\n",
    "    targets_tensor = torch.stack(targets_lst).to(device)\n",
    "    return inputs_tensor, targets_tensor"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "6eb2bce3-28a7-4f39-9d4b-5e972d69066c",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "6eb2bce3-28a7-4f39-9d4b-5e972d69066c",
    "outputId": "3d104439-c328-431b-ef7c-2639d86c2135"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([[    0,     1,     2,     3,     4],\n",
      "        [    5,     6, 50256, 50256, 50256],\n",
      "        [    7,     8,     9, 50256, 50256]])\n",
      "tensor([[    1,     2,     3,     4, 50256],\n",
      "        [    6, 50256, 50256, 50256, 50256],\n",
      "        [    8,     9, 50256, 50256, 50256]])\n"
     ]
    }
   ],
   "source": [
    "inputs, targets = custom_collate_draft_2(batch)\n",
    "print(inputs)\n",
    "print(targets)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3bf85703-a0e0-42aa-8f29-cbc28dbf4e15",
   "metadata": {
    "id": "3bf85703-a0e0-42aa-8f29-cbc28dbf4e15"
   },
   "source": [
    "- Next, we introduce an `ignore_index` value to replace all padding token IDs with a new value; the purpose of this `ignore_index` is that we can ignore padding values in the loss function (more on that later)\n",
    "\n",
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/11.webp?1\" width=400px>\n",
    "\n",
    "- Concretely, this means that we replace the token IDs corresponding to `50256` with `-100` as illustrated below"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bd4bed33-956e-4b3f-a09c-586d8203109a",
   "metadata": {
    "id": "bd4bed33-956e-4b3f-a09c-586d8203109a"
   },
   "source": [
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/12.webp?2\" width=500px>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5346513e-c3f4-44fe-af22-4ebd36497728",
   "metadata": {
    "id": "5346513e-c3f4-44fe-af22-4ebd36497728"
   },
   "source": [
    "- (In addition, we also introduce the `allowed_max_length` in case we want to limit the length of the samples; this will be useful if you plan to work with your own datasets that are longer than the 1024 token context size supported by the GPT-2 model)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "41ec6e2d-9eb2-4124-913e-d2af39be4cf2",
   "metadata": {
    "id": "41ec6e2d-9eb2-4124-913e-d2af39be4cf2"
   },
   "outputs": [],
   "source": [
    "def custom_collate_fn(\n",
    "    batch,\n",
    "    pad_token_id=50256,\n",
    "    ignore_index=-100,\n",
    "    allowed_max_length=None,\n",
    "    device=\"cpu\"\n",
    "):\n",
    "    # Find the longest sequence in the batch\n",
    "    batch_max_length = max(len(item)+1 for item in batch)\n",
    "\n",
    "    # Pad and prepare inputs and targets\n",
    "    inputs_lst, targets_lst = [], []\n",
    "\n",
    "    for item in batch:\n",
    "        new_item = item.copy()\n",
    "        # Add an <|endoftext|> token\n",
    "        new_item += [pad_token_id]\n",
    "        # Pad sequences to max_length\n",
    "        padded = (\n",
    "            new_item + [pad_token_id] *\n",
    "            (batch_max_length - len(new_item))\n",
    "        )\n",
    "        inputs = torch.tensor(padded[:-1])  # Truncate the last token for inputs\n",
    "        targets = torch.tensor(padded[1:])  # Shift +1 to the right for targets\n",
    "\n",
    "        # New: Replace all but the first padding tokens in targets by ignore_index\n",
    "        mask = targets == pad_token_id\n",
    "        indices = torch.nonzero(mask).squeeze()\n",
    "        if indices.numel() > 1:\n",
    "            targets[indices[1:]] = ignore_index\n",
    "\n",
    "        # New: Optionally truncate to maximum sequence length\n",
    "        if allowed_max_length is not None:\n",
    "            inputs = inputs[:allowed_max_length]\n",
    "            targets = targets[:allowed_max_length]\n",
    "\n",
    "        inputs_lst.append(inputs)\n",
    "        targets_lst.append(targets)\n",
    "\n",
    "    # Convert list of inputs and targets to tensors and transfer to target device\n",
    "    inputs_tensor = torch.stack(inputs_lst).to(device)\n",
    "    targets_tensor = torch.stack(targets_lst).to(device)\n",
    "\n",
    "    return inputs_tensor, targets_tensor"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "cdf5eec4-9ebe-4be0-9fca-9a47bee88fdc",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "cdf5eec4-9ebe-4be0-9fca-9a47bee88fdc",
    "outputId": "e8f709b9-f4c5-428a-a6ac-2a4c1b9358ba"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([[    0,     1,     2,     3,     4],\n",
      "        [    5,     6, 50256, 50256, 50256],\n",
      "        [    7,     8,     9, 50256, 50256]])\n",
      "tensor([[    1,     2,     3,     4, 50256],\n",
      "        [    6, 50256,  -100,  -100,  -100],\n",
      "        [    8,     9, 50256,  -100,  -100]])\n"
     ]
    }
   ],
   "source": [
    "inputs, targets = custom_collate_fn(batch)\n",
    "print(inputs)\n",
    "print(targets)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "26727c90-0d42-43b3-af21-0a66ad4fbbc7",
   "metadata": {
    "id": "26727c90-0d42-43b3-af21-0a66ad4fbbc7"
   },
   "source": [
    "- Let's see what this replacement by -100 accomplishes\n",
    "- For illustration purposes, let's assume we have a small classification task with 2 class labels, 0 and 1, similar to chapter 6\n",
    "- If we have the following logits values (outputs of the last layer of the model), we calculate the following loss"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "W2jvh-OP9MFV",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "W2jvh-OP9MFV",
    "outputId": "ccb3a703-59a7-4258-8841-57959a016e31"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor(1.1269)\n"
     ]
    }
   ],
   "source": [
    "logits_1 = torch.tensor(\n",
    "    [[-1.0, 1.0],  # 1st training example\n",
    "     [-0.5, 1.5]]  # 2nd training example\n",
    ")\n",
    "targets_1 = torch.tensor([0, 1])\n",
    "\n",
    "\n",
    "loss_1 = torch.nn.functional.cross_entropy(logits_1, targets_1)\n",
    "print(loss_1)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5edd3244-8886-4505-92e9-367d28529e1e",
   "metadata": {
    "id": "5edd3244-8886-4505-92e9-367d28529e1e"
   },
   "source": [
    "- Now, adding one more training example will, as expected, influence the loss"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "nvVMuil89v9N",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "nvVMuil89v9N",
    "outputId": "6d4683d4-5bfc-4a8c-de2a-95ecb2e716b9"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor(0.7936)\n"
     ]
    }
   ],
   "source": [
    "logits_2 = torch.tensor(\n",
    "    [[-1.0, 1.0],\n",
    "     [-0.5, 1.5],\n",
    "     [-0.5, 1.5]]  # New 3rd training example\n",
    ")\n",
    "targets_2 = torch.tensor([0, 1, 1])\n",
    "\n",
    "loss_2 = torch.nn.functional.cross_entropy(logits_2, targets_2)\n",
    "print(loss_2)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "54dca331-40e0-468b-b690-189fe156ba8f",
   "metadata": {
    "id": "54dca331-40e0-468b-b690-189fe156ba8f"
   },
   "source": [
    "- Let's see what happens if we replace the class label of one of the examples with -100"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "RTyB1vah9p56",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "RTyB1vah9p56",
    "outputId": "da05302e-3fe0-439e-d1ed-82066bceb122"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor(1.1269)\n",
      "loss_1 == loss_3: tensor(True)\n"
     ]
    }
   ],
   "source": [
    "targets_3 = torch.tensor([0, 1, -100])\n",
    "\n",
    "loss_3 = torch.nn.functional.cross_entropy(logits_2, targets_3)\n",
    "print(loss_3)\n",
    "print(\"loss_1 == loss_3:\", loss_1 == loss_3)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cef09d21-b652-4760-abea-4f76920e6a25",
   "metadata": {
    "id": "cef09d21-b652-4760-abea-4f76920e6a25"
   },
   "source": [
    "- As we can see, the resulting loss on these 3 training examples is the same as the loss we calculated from the 2 training examples, which means that the cross-entropy loss function ignored the training example with the -100 label\n",
    "- By default, PyTorch has the `cross_entropy(..., ignore_index=-100)` setting to ignore examples corresponding to the label -100\n",
    "- Using this -100 `ignore_index`, we can ignore the additional end-of-text (padding) tokens in the batches that we used to pad the training examples to equal length\n",
    "- However, we don't want to ignore the first instance of the end-of-text (padding) token (50256) because it can help signal to the LLM when the response is complete"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6a4e9c5f-7c49-4321-9f1b-a50468a84524",
   "metadata": {
    "id": "6a4e9c5f-7c49-4321-9f1b-a50468a84524"
   },
   "source": [
    "- In practice, it is also common to mask out the target token IDs that correspond to the instruction, as illustrated in the figure below (this is a recommended reader exercise after completing the chapter)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fab8f0ed-80e8-4fd9-bf84-e5d0e0bc0a39",
   "metadata": {
    "id": "fab8f0ed-80e8-4fd9-bf84-e5d0e0bc0a39"
   },
   "source": [
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/13.webp\" width=600px>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bccaf048-ec95-498c-9155-d5b3ccba6c96",
   "metadata": {
    "id": "bccaf048-ec95-498c-9155-d5b3ccba6c96"
   },
   "source": [
    "&nbsp;\n",
    "## 7.4 Creating data loaders for an instruction dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e6b8e656-3af3-4db6-8dde-d8c216a12f50",
   "metadata": {
    "id": "e6b8e656-3af3-4db6-8dde-d8c216a12f50"
   },
   "source": [
    "- In this section, we use the `InstructionDataset` class and `custom_collate_fn` function to instantiate the training, validation, and test data loaders"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9fffe390-b226-4d5c-983f-9f4da773cb82",
   "metadata": {
    "id": "9fffe390-b226-4d5c-983f-9f4da773cb82"
   },
   "source": [
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/14.webp\" width=500px>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "932677e9-9317-42e8-b461-7b0269518f97",
   "metadata": {
    "id": "932677e9-9317-42e8-b461-7b0269518f97"
   },
   "source": [
    "- Another additional detail of the previous `custom_collate_fn` function is that we now directly move the data to the target device (e.g., GPU) instead of doing it in the main training loop, which improves efficiency because it can be carried out as a background process when we use the `custom_collate_fn` as part of the data loader\n",
    "- Using the `partial` function from Python's `functools` standard library, we create a new function with the `device` argument of the original function pre-filled"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "etpqqWh8phKc",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "etpqqWh8phKc",
    "outputId": "b4391c33-1a89-455b-faaa-5f874b6eb409"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Device: mps\n"
     ]
    }
   ],
   "source": [
    "if torch.cuda.is_available():\n",
    "    device = torch.device(\"cuda\")\n",
    "elif torch.backends.mps.is_available():\n",
    "    # Use PyTorch 2.9 or newer for stable mps results\n",
    "    major, minor = map(int, torch.__version__.split(\".\")[:2])\n",
    "    if (major, minor) >= (2, 9):\n",
    "        device = torch.device(\"mps\")\n",
    "    else:\n",
    "        device = torch.device(\"cpu\")\n",
    "else:\n",
    "    device = torch.device(\"cpu\")\n",
    "\n",
    "print(\"Device:\", device)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "4e47fb30-c2c6-4e6d-a64c-76cc65be4a2c",
   "metadata": {
    "id": "4e47fb30-c2c6-4e6d-a64c-76cc65be4a2c"
   },
   "outputs": [],
   "source": [
    "from functools import partial\n",
    "\n",
    "customized_collate_fn = partial(\n",
    "    custom_collate_fn,\n",
    "    device=device,\n",
    "    allowed_max_length=1024\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8ff42c29-8b81-45e5-ae8d-b97cd1cf447a",
   "metadata": {
    "id": "8ff42c29-8b81-45e5-ae8d-b97cd1cf447a"
   },
   "source": [
    "- Next, we instantiate the data loaders similar to previous chapters, except that we now provide our own collate function for the batching process"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "BtWkgir6Hlpe",
   "metadata": {
    "id": "BtWkgir6Hlpe"
   },
   "outputs": [],
   "source": [
    "from torch.utils.data import DataLoader\n",
    "\n",
    "\n",
    "num_workers = 0\n",
    "batch_size = 8\n",
    "\n",
    "torch.manual_seed(123)\n",
    "\n",
    "train_dataset = InstructionDataset(train_data, tokenizer)\n",
    "train_loader = DataLoader(\n",
    "    train_dataset,\n",
    "    batch_size=batch_size,\n",
    "    collate_fn=customized_collate_fn,\n",
    "    shuffle=True,\n",
    "    drop_last=True,\n",
    "    num_workers=num_workers\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "1d097dc8-ad34-4f05-b435-e4147965f532",
   "metadata": {
    "id": "1d097dc8-ad34-4f05-b435-e4147965f532"
   },
   "outputs": [],
   "source": [
    "val_dataset = InstructionDataset(val_data, tokenizer)\n",
    "val_loader = DataLoader(\n",
    "    val_dataset,\n",
    "    batch_size=batch_size,\n",
    "    collate_fn=customized_collate_fn,\n",
    "    shuffle=False,\n",
    "    drop_last=False,\n",
    "    num_workers=num_workers\n",
    ")\n",
    "\n",
    "test_dataset = InstructionDataset(test_data, tokenizer)\n",
    "test_loader = DataLoader(\n",
    "    test_dataset,\n",
    "    batch_size=batch_size,\n",
    "    collate_fn=customized_collate_fn,\n",
    "    shuffle=False,\n",
    "    drop_last=False,\n",
    "    num_workers=num_workers\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3f67c147-b1a2-4a95-9807-e2d0de0324c0",
   "metadata": {
    "id": "3f67c147-b1a2-4a95-9807-e2d0de0324c0"
   },
   "source": [
    "- Let's see what the dimensions of the resulting input and target batches look like"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "GGs1AI3vHpnX",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "GGs1AI3vHpnX",
    "outputId": "f6a74c8b-1af3-4bc1-b48c-eda64b0200d1"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Train loader:\n",
      "torch.Size([8, 61]) torch.Size([8, 61])\n",
      "torch.Size([8, 76]) torch.Size([8, 76])\n",
      "torch.Size([8, 73]) torch.Size([8, 73])\n",
      "torch.Size([8, 68]) torch.Size([8, 68])\n",
      "torch.Size([8, 65]) torch.Size([8, 65])\n",
      "torch.Size([8, 72]) torch.Size([8, 72])\n",
      "torch.Size([8, 80]) torch.Size([8, 80])\n",
      "torch.Size([8, 67]) torch.Size([8, 67])\n",
      "torch.Size([8, 62]) torch.Size([8, 62])\n",
      "torch.Size([8, 75]) torch.Size([8, 75])\n",
      "torch.Size([8, 62]) torch.Size([8, 62])\n",
      "torch.Size([8, 68]) torch.Size([8, 68])\n",
      "torch.Size([8, 67]) torch.Size([8, 67])\n",
      "torch.Size([8, 77]) torch.Size([8, 77])\n",
      "torch.Size([8, 69]) torch.Size([8, 69])\n",
      "torch.Size([8, 79]) torch.Size([8, 79])\n",
      "torch.Size([8, 71]) torch.Size([8, 71])\n",
      "torch.Size([8, 66]) torch.Size([8, 66])\n",
      "torch.Size([8, 83]) torch.Size([8, 83])\n",
      "torch.Size([8, 68]) torch.Size([8, 68])\n",
      "torch.Size([8, 80]) torch.Size([8, 80])\n",
      "torch.Size([8, 71]) torch.Size([8, 71])\n",
      "torch.Size([8, 69]) torch.Size([8, 69])\n",
      "torch.Size([8, 65]) torch.Size([8, 65])\n",
      "torch.Size([8, 68]) torch.Size([8, 68])\n",
      "torch.Size([8, 60]) torch.Size([8, 60])\n",
      "torch.Size([8, 59]) torch.Size([8, 59])\n",
      "torch.Size([8, 69]) torch.Size([8, 69])\n",
      "torch.Size([8, 63]) torch.Size([8, 63])\n",
      "torch.Size([8, 65]) torch.Size([8, 65])\n",
      "torch.Size([8, 76]) torch.Size([8, 76])\n",
      "torch.Size([8, 66]) torch.Size([8, 66])\n",
      "torch.Size([8, 71]) torch.Size([8, 71])\n",
      "torch.Size([8, 91]) torch.Size([8, 91])\n",
      "torch.Size([8, 65]) torch.Size([8, 65])\n",
      "torch.Size([8, 64]) torch.Size([8, 64])\n",
      "torch.Size([8, 67]) torch.Size([8, 67])\n",
      "torch.Size([8, 66]) torch.Size([8, 66])\n",
      "torch.Size([8, 64]) torch.Size([8, 64])\n",
      "torch.Size([8, 65]) torch.Size([8, 65])\n",
      "torch.Size([8, 75]) torch.Size([8, 75])\n",
      "torch.Size([8, 89]) torch.Size([8, 89])\n",
      "torch.Size([8, 59]) torch.Size([8, 59])\n",
      "torch.Size([8, 88]) torch.Size([8, 88])\n",
      "torch.Size([8, 83]) torch.Size([8, 83])\n",
      "torch.Size([8, 83]) torch.Size([8, 83])\n",
      "torch.Size([8, 70]) torch.Size([8, 70])\n",
      "torch.Size([8, 65]) torch.Size([8, 65])\n",
      "torch.Size([8, 74]) torch.Size([8, 74])\n",
      "torch.Size([8, 76]) torch.Size([8, 76])\n",
      "torch.Size([8, 67]) torch.Size([8, 67])\n",
      "torch.Size([8, 75]) torch.Size([8, 75])\n",
      "torch.Size([8, 83]) torch.Size([8, 83])\n",
      "torch.Size([8, 69]) torch.Size([8, 69])\n",
      "torch.Size([8, 67]) torch.Size([8, 67])\n",
      "torch.Size([8, 60]) torch.Size([8, 60])\n",
      "torch.Size([8, 60]) torch.Size([8, 60])\n",
      "torch.Size([8, 66]) torch.Size([8, 66])\n",
      "torch.Size([8, 80]) torch.Size([8, 80])\n",
      "torch.Size([8, 71]) torch.Size([8, 71])\n",
      "torch.Size([8, 61]) torch.Size([8, 61])\n",
      "torch.Size([8, 58]) torch.Size([8, 58])\n",
      "torch.Size([8, 71]) torch.Size([8, 71])\n",
      "torch.Size([8, 67]) torch.Size([8, 67])\n",
      "torch.Size([8, 68]) torch.Size([8, 68])\n",
      "torch.Size([8, 63]) torch.Size([8, 63])\n",
      "torch.Size([8, 87]) torch.Size([8, 87])\n",
      "torch.Size([8, 68]) torch.Size([8, 68])\n",
      "torch.Size([8, 64]) torch.Size([8, 64])\n",
      "torch.Size([8, 68]) torch.Size([8, 68])\n",
      "torch.Size([8, 71]) torch.Size([8, 71])\n",
      "torch.Size([8, 68]) torch.Size([8, 68])\n",
      "torch.Size([8, 71]) torch.Size([8, 71])\n",
      "torch.Size([8, 61]) torch.Size([8, 61])\n",
      "torch.Size([8, 65]) torch.Size([8, 65])\n",
      "torch.Size([8, 67]) torch.Size([8, 67])\n",
      "torch.Size([8, 65]) torch.Size([8, 65])\n",
      "torch.Size([8, 64]) torch.Size([8, 64])\n",
      "torch.Size([8, 60]) torch.Size([8, 60])\n",
      "torch.Size([8, 72]) torch.Size([8, 72])\n",
      "torch.Size([8, 64]) torch.Size([8, 64])\n",
      "torch.Size([8, 70]) torch.Size([8, 70])\n",
      "torch.Size([8, 57]) torch.Size([8, 57])\n",
      "torch.Size([8, 72]) torch.Size([8, 72])\n",
      "torch.Size([8, 64]) torch.Size([8, 64])\n",
      "torch.Size([8, 68]) torch.Size([8, 68])\n",
      "torch.Size([8, 62]) torch.Size([8, 62])\n",
      "torch.Size([8, 74]) torch.Size([8, 74])\n",
      "torch.Size([8, 80]) torch.Size([8, 80])\n",
      "torch.Size([8, 68]) torch.Size([8, 68])\n",
      "torch.Size([8, 70]) torch.Size([8, 70])\n",
      "torch.Size([8, 91]) torch.Size([8, 91])\n",
      "torch.Size([8, 61]) torch.Size([8, 61])\n",
      "torch.Size([8, 66]) torch.Size([8, 66])\n",
      "torch.Size([8, 80]) torch.Size([8, 80])\n",
      "torch.Size([8, 81]) torch.Size([8, 81])\n",
      "torch.Size([8, 74]) torch.Size([8, 74])\n",
      "torch.Size([8, 82]) torch.Size([8, 82])\n",
      "torch.Size([8, 63]) torch.Size([8, 63])\n",
      "torch.Size([8, 83]) torch.Size([8, 83])\n",
      "torch.Size([8, 68]) torch.Size([8, 68])\n",
      "torch.Size([8, 67]) torch.Size([8, 67])\n",
      "torch.Size([8, 77]) torch.Size([8, 77])\n",
      "torch.Size([8, 91]) torch.Size([8, 91])\n",
      "torch.Size([8, 64]) torch.Size([8, 64])\n",
      "torch.Size([8, 61]) torch.Size([8, 61])\n",
      "torch.Size([8, 75]) torch.Size([8, 75])\n",
      "torch.Size([8, 64]) torch.Size([8, 64])\n",
      "torch.Size([8, 66]) torch.Size([8, 66])\n",
      "torch.Size([8, 78]) torch.Size([8, 78])\n",
      "torch.Size([8, 66]) torch.Size([8, 66])\n",
      "torch.Size([8, 64]) torch.Size([8, 64])\n",
      "torch.Size([8, 83]) torch.Size([8, 83])\n",
      "torch.Size([8, 66]) torch.Size([8, 66])\n",
      "torch.Size([8, 74]) torch.Size([8, 74])\n",
      "torch.Size([8, 69]) torch.Size([8, 69])\n"
     ]
    }
   ],
   "source": [
    "print(\"Train loader:\")\n",
    "for inputs, targets in train_loader:\n",
    "    print(inputs.shape, targets.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0c8e8dd7-d46a-4cc3-8a7e-c1d31e1b4657",
   "metadata": {
    "id": "0c8e8dd7-d46a-4cc3-8a7e-c1d31e1b4657"
   },
   "source": [
    "- As we can see based on the output above, all batches have a batch size of 8 but a different length, as expected\n",
    "- Let's also double-check that the inputs contain the `<|endoftext|>` padding tokens corresponding to token ID 50256 by printing the contents of the first training example in the `inputs` batch"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "21b8fd02-014f-4481-9b71-5bfee8f9dfcd",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "21b8fd02-014f-4481-9b71-5bfee8f9dfcd",
    "outputId": "1b8ad342-2b5b-4f12-ad1a-3cb2a6c712ff"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([21106,   318,   281, 12064,   326,  8477,   257,  4876,    13, 19430,\n",
      "          257,  2882,   326, 20431, 32543,   262,  2581,    13,   198,   198,\n",
      "        21017, 46486,    25,   198, 30003,  6525,   262,  6827,  1262,   257,\n",
      "          985,   576,    13,   198,   198, 21017, 23412,    25,   198,   464,\n",
      "         5156,   318,   845, 13779,    13,   198,   198, 21017, 18261,    25,\n",
      "          198,   464,  5156,   318,   355, 13779,   355,   257,  4936,    13,\n",
      "        50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256],\n",
      "       device='mps:0')\n"
     ]
    }
   ],
   "source": [
    "print(inputs[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5f1f3647-8971-4006-89e0-6a2a1ec1d360",
   "metadata": {
    "id": "5f1f3647-8971-4006-89e0-6a2a1ec1d360"
   },
   "source": [
    "- Similarly, we visually double-check that the targets contain the -100 placeholder tokens"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "51649ab4-1a7e-4a9e-92c5-950a24fde211",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "51649ab4-1a7e-4a9e-92c5-950a24fde211",
    "outputId": "5e8c23f8-6a05-4c13-9f92-373b75b57ea6"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([  318,   281, 12064,   326,  8477,   257,  4876,    13, 19430,   257,\n",
      "         2882,   326, 20431, 32543,   262,  2581,    13,   198,   198, 21017,\n",
      "        46486,    25,   198, 30003,  6525,   262,  6827,  1262,   257,   985,\n",
      "          576,    13,   198,   198, 21017, 23412,    25,   198,   464,  5156,\n",
      "          318,   845, 13779,    13,   198,   198, 21017, 18261,    25,   198,\n",
      "          464,  5156,   318,   355, 13779,   355,   257,  4936,    13, 50256,\n",
      "         -100,  -100,  -100,  -100,  -100,  -100,  -100,  -100,  -100],\n",
      "       device='mps:0')\n"
     ]
    }
   ],
   "source": [
    "print(targets[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d6aad445-8f19-4238-b9bf-db80767fb91a",
   "metadata": {
    "id": "d6aad445-8f19-4238-b9bf-db80767fb91a"
   },
   "source": [
    "## 7.5 Loading a pretrained LLM"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5a5c07d1-4fc9-4846-94cf-b11a085a667b",
   "metadata": {
    "id": "5a5c07d1-4fc9-4846-94cf-b11a085a667b"
   },
   "source": [
    "- In this section, we load a pretrained GPT model using the same code that we used in section 5.5 of chapter 5 and section 6.4 in chapter 6"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8d1b438f-88af-413f-96a9-f059c6c55fc4",
   "metadata": {
    "id": "8d1b438f-88af-413f-96a9-f059c6c55fc4"
   },
   "source": [
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/15.webp?1\" width=500px>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8c68eda7-e02e-4caa-846b-ca6dbd396ca2",
   "metadata": {
    "id": "8c68eda7-e02e-4caa-846b-ca6dbd396ca2"
   },
   "source": [
    "- However, instead of loading the smallest 124 million parameter model, we load the medium version with 355 million parameters since the 124 million model is too small for achieving qualitatively reasonable results via instruction finetuning"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "0d249d67-5eba-414e-9bd2-972ebf01329d",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "0d249d67-5eba-414e-9bd2-972ebf01329d",
    "outputId": "386ebd49-51d7-4a62-c590-91cdccce5fb8"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "File already exists and is up-to-date: gpt2/355M/checkpoint\n",
      "File already exists and is up-to-date: gpt2/355M/encoder.json\n",
      "File already exists and is up-to-date: gpt2/355M/hparams.json\n",
      "File already exists and is up-to-date: gpt2/355M/model.ckpt.data-00000-of-00001\n",
      "File already exists and is up-to-date: gpt2/355M/model.ckpt.index\n",
      "File already exists and is up-to-date: gpt2/355M/model.ckpt.meta\n",
      "File already exists and is up-to-date: gpt2/355M/vocab.bpe\n"
     ]
    }
   ],
   "source": [
    "from gpt_download import download_and_load_gpt2\n",
    "from previous_chapters import GPTModel, load_weights_into_gpt\n",
    "# If the `previous_chapters.py` file is not available locally,\n",
    "# you can import it from the `llms-from-scratch` PyPI package.\n",
    "# For details, see: https://github.com/rasbt/LLMs-from-scratch/tree/main/pkg\n",
    "# E.g.,\n",
    "# from llms_from_scratch.ch04 import GPTModel\n",
    "# from llms_from_scratch.ch05 import download_and_load_gpt2, load_weights_into_gpt\n",
    "\n",
    "\n",
    "BASE_CONFIG = {\n",
    "    \"vocab_size\": 50257,     # Vocabulary size\n",
    "    \"context_length\": 1024,  # Context length\n",
    "    \"drop_rate\": 0.0,        # Dropout rate\n",
    "    \"qkv_bias\": True         # Query-key-value bias\n",
    "}\n",
    "\n",
    "model_configs = {\n",
    "    \"gpt2-small (124M)\": {\"emb_dim\": 768, \"n_layers\": 12, \"n_heads\": 12},\n",
    "    \"gpt2-medium (355M)\": {\"emb_dim\": 1024, \"n_layers\": 24, \"n_heads\": 16},\n",
    "    \"gpt2-large (774M)\": {\"emb_dim\": 1280, \"n_layers\": 36, \"n_heads\": 20},\n",
    "    \"gpt2-xl (1558M)\": {\"emb_dim\": 1600, \"n_layers\": 48, \"n_heads\": 25},\n",
    "}\n",
    "\n",
    "CHOOSE_MODEL = \"gpt2-medium (355M)\"\n",
    "\n",
    "BASE_CONFIG.update(model_configs[CHOOSE_MODEL])\n",
    "\n",
    "model_size = CHOOSE_MODEL.split(\" \")[-1].lstrip(\"(\").rstrip(\")\")\n",
    "settings, params = download_and_load_gpt2(\n",
    "    model_size=model_size,\n",
    "    models_dir=\"gpt2\"\n",
    ")\n",
    "\n",
    "model = GPTModel(BASE_CONFIG)\n",
    "load_weights_into_gpt(model, params)\n",
    "model.eval();"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dbf3afed-bc8e-4d3a-ad9d-eb6f57bb7af5",
   "metadata": {
    "id": "dbf3afed-bc8e-4d3a-ad9d-eb6f57bb7af5"
   },
   "source": [
    "- Before we start finetuning the model in the next section, let's see how it performs on one of the validation tasks"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "7bd32b7c-5b44-4d25-a09f-46836802ca74",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "7bd32b7c-5b44-4d25-a09f-46836802ca74",
    "outputId": "c1276a91-e7da-495b-be0f-70a96872dbe6"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n",
      "\n",
      "### Instruction:\n",
      "Convert the active sentence to passive: 'The chef cooks the meal every day.'\n"
     ]
    }
   ],
   "source": [
    "torch.manual_seed(123)\n",
    "\n",
    "input_text = format_input(val_data[0])\n",
    "print(input_text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "2e3e68e0-2627-4c65-b4e7-1e0667e4f6fa",
   "metadata": {
    "id": "2e3e68e0-2627-4c65-b4e7-1e0667e4f6fa"
   },
   "outputs": [],
   "source": [
    "from previous_chapters import (\n",
    "    generate,\n",
    "    text_to_token_ids,\n",
    "    token_ids_to_text\n",
    ")\n",
    "# Alternatively:\n",
    "# from llms_from_scratch.ch05 import (\n",
    "#    generate,\n",
    "#    text_to_token_ids,\n",
    "#    token_ids_to_text\n",
    "# )\n",
    "\n",
    "\n",
    "token_ids = generate(\n",
    "    model=model,\n",
    "    idx=text_to_token_ids(input_text, tokenizer),\n",
    "    max_new_tokens=35,\n",
    "    context_size=BASE_CONFIG[\"context_length\"],\n",
    "    eos_id=50256,\n",
    ")\n",
    "generated_text = token_ids_to_text(token_ids, tokenizer)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "36e2fda5-f796-4954-8f72-1dd1123e3344",
   "metadata": {
    "id": "36e2fda5-f796-4954-8f72-1dd1123e3344"
   },
   "source": [
    "- Note that the `generate` function we used in previous chapters returns the combined input and output text, which was convenient in the previous section for creating legible text\n",
    "- To isolate the response, we can subtract the length of the instruction from the start of the `generated_text`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "ba4a55bf-a245-48d8-beda-2838a58fb5ba",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "ba4a55bf-a245-48d8-beda-2838a58fb5ba",
    "outputId": "3e231f03-c5dc-4397-8778-4995731176a3"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The chef cooks the meal every day.\n",
      "\n",
      "### Instruction:\n",
      "\n",
      "Convert the active sentence to passive: 'The chef cooks the\n"
     ]
    }
   ],
   "source": [
    "response_text = (\n",
    "    generated_text[len(input_text):]\n",
    "    .replace(\"### Response:\", \"\")\n",
    "    .strip()\n",
    ")\n",
    "print(response_text)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d44080b2-a4c5-4520-a797-549519f66a3e",
   "metadata": {
    "id": "d44080b2-a4c5-4520-a797-549519f66a3e"
   },
   "source": [
    "- As we can see, the model is not capable of following the instructions, yet; it creates a \"Response\" section but it simply repeats the original input sentence as well as the instruction"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "70d27b9d-a942-4cf5-b797-848c5f01e723",
   "metadata": {
    "id": "70d27b9d-a942-4cf5-b797-848c5f01e723"
   },
   "source": [
    "## 7.6 Finetuning the LLM on instruction data"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "314b2a39-88b4-44d8-8c85-1c5b0cd6cc4a",
   "metadata": {
    "id": "314b2a39-88b4-44d8-8c85-1c5b0cd6cc4a"
   },
   "source": [
    "- In this section, we finetune the model\n",
    "\n",
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/16.webp\" width=500px>\n",
    "\n",
    "- Note that we can reuse all the loss calculation and training functions that we used in previous chapters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "id": "65444865-df87-4d98-9faf-875e1c4be860",
   "metadata": {
    "id": "65444865-df87-4d98-9faf-875e1c4be860"
   },
   "outputs": [],
   "source": [
    "from previous_chapters import (\n",
    "    calc_loss_loader,\n",
    "    train_model_simple\n",
    ")\n",
    "# Alternatively:\n",
    "# from llms_from_scratch.ch05 import (\n",
    "#    calc_loss_loader,\n",
    "#    train_model_simple,\n",
    "# )\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "00083059-aa41-4d37-8a17-1c72d1b1ca00",
   "metadata": {
    "id": "00083059-aa41-4d37-8a17-1c72d1b1ca00"
   },
   "source": [
    "- Let's calculate the initial training and validation set loss before we start training (as in previous chapters, the goal is to minimize the loss)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "id": "d99fc6f8-63b2-43da-adbb-a7b6b92c8dd5",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "d99fc6f8-63b2-43da-adbb-a7b6b92c8dd5",
    "outputId": "a3f5e1b0-093a-4c51-e7fc-c9cac48c2ea2"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Training loss: 3.8259105682373047\n",
      "Validation loss: 3.7619349479675295\n"
     ]
    }
   ],
   "source": [
    "model.to(device)\n",
    "\n",
    "torch.manual_seed(123)\n",
    "\n",
    "with torch.no_grad():\n",
    "    train_loss = calc_loss_loader(train_loader, model, device, num_batches=5)\n",
    "    val_loss = calc_loss_loader(val_loader, model, device, num_batches=5)\n",
    "\n",
    "print(\"Training loss:\", train_loss)\n",
    "print(\"Validation loss:\", val_loss)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "12a6da8f-15b3-42b0-a136-619b7a35c3e9",
   "metadata": {
    "id": "12a6da8f-15b3-42b0-a136-619b7a35c3e9"
   },
   "source": [
    "- Note that the training is a bit more expensive than in previous chapters since we are using a larger model (355 million instead of 124 million parameters)\n",
    "- The runtimes for various devices are shown for reference below (running this notebook on a compatible GPU device requires no changes to the code)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "db4b57fb-e689-4550-931c-6d34a932487c",
   "metadata": {
    "id": "db4b57fb-e689-4550-931c-6d34a932487c"
   },
   "source": [
    "<div style=\"text-align: left;\">\n",
    "    \n",
    "| Model              | Device                | Runtime for 2 Epochs |\n",
    "|--------------------|-----------------------|----------------------|\n",
    "| gpt2-medium (355M) | CPU (M3 MacBook Air)  | 15.78 minutes        |\n",
    "| gpt2-medium (355M) | GPU (M3 MacBook Air)  | 10.77 minutes        |\n",
    "| gpt2-medium (355M) | GPU (L4)              | 1.83 minutes         |\n",
    "| gpt2-medium (355M) | GPU (A100)            | 0.86 minutes         |\n",
    "| gpt2-small (124M)  | CPU (M3 MacBook Air)  | 5.74 minutes         |\n",
    "| gpt2-small (124M)  | GPU (M3 MacBook Air)  | 3.73 minutes         |\n",
    "| gpt2-small (124M)  | GPU (L4)              | 0.69 minutes         |\n",
    "| gpt2-small (124M)  | GPU (A100)            | 0.39 minutes         |\n",
    "\n",
    "</div>\n",
    "\n",
    "- I ran this notebook using the `\"gpt2-medium (355M)\"` model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "id": "78bcf83a-1fff-4540-97c1-765c4016d5e3",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "78bcf83a-1fff-4540-97c1-765c4016d5e3",
    "outputId": "ecb9a3dd-97c0-492d-8a51-fbd175bb139b"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Ep 1 (Step 000000): Train loss 2.637, Val loss 2.626\n",
      "Ep 1 (Step 000005): Train loss 1.174, Val loss 1.103\n",
      "Ep 1 (Step 000010): Train loss 0.872, Val loss 0.944\n",
      "Ep 1 (Step 000015): Train loss 0.857, Val loss 0.906\n",
      "Ep 1 (Step 000020): Train loss 0.776, Val loss 0.881\n",
      "Ep 1 (Step 000025): Train loss 0.754, Val loss 0.859\n",
      "Ep 1 (Step 000030): Train loss 0.800, Val loss 0.836\n",
      "Ep 1 (Step 000035): Train loss 0.714, Val loss 0.809\n",
      "Ep 1 (Step 000040): Train loss 0.672, Val loss 0.806\n",
      "Ep 1 (Step 000045): Train loss 0.633, Val loss 0.789\n",
      "Ep 1 (Step 000050): Train loss 0.663, Val loss 0.783\n",
      "Ep 1 (Step 000055): Train loss 0.760, Val loss 0.763\n",
      "Ep 1 (Step 000060): Train loss 0.719, Val loss 0.743\n",
      "Ep 1 (Step 000065): Train loss 0.653, Val loss 0.735\n",
      "Ep 1 (Step 000070): Train loss 0.535, Val loss 0.732\n",
      "Ep 1 (Step 000075): Train loss 0.567, Val loss 0.736\n",
      "Ep 1 (Step 000080): Train loss 0.602, Val loss 0.731\n",
      "Ep 1 (Step 000085): Train loss 0.513, Val loss 0.715\n",
      "Ep 1 (Step 000090): Train loss 0.571, Val loss 0.696\n",
      "Ep 1 (Step 000095): Train loss 0.504, Val loss 0.687\n",
      "Ep 1 (Step 000100): Train loss 0.507, Val loss 0.682\n",
      "Ep 1 (Step 000105): Train loss 0.568, Val loss 0.674\n",
      "Ep 1 (Step 000110): Train loss 0.562, Val loss 0.669\n",
      "Ep 1 (Step 000115): Train loss 0.519, Val loss 0.665\n",
      "Below is an instruction that describes a task. Write a response that appropriately completes the request.  ### Instruction: Convert the active sentence to passive: 'The chef cooks the meal every day.'  ### Response: The meal is prepared every day by the chef.<|endoftext|>The following is an instruction that describes a task. Write a response that appropriately completes the request.  ### Instruction: Convert the active sentence to passive:\n",
      "Ep 2 (Step 000120): Train loss 0.437, Val loss 0.670\n",
      "Ep 2 (Step 000125): Train loss 0.454, Val loss 0.686\n",
      "Ep 2 (Step 000130): Train loss 0.447, Val loss 0.681\n",
      "Ep 2 (Step 000135): Train loss 0.406, Val loss 0.677\n",
      "Ep 2 (Step 000140): Train loss 0.407, Val loss 0.676\n",
      "Ep 2 (Step 000145): Train loss 0.373, Val loss 0.677\n",
      "Ep 2 (Step 000150): Train loss 0.381, Val loss 0.674\n",
      "Ep 2 (Step 000155): Train loss 0.419, Val loss 0.676\n",
      "Ep 2 (Step 000160): Train loss 0.414, Val loss 0.686\n",
      "Ep 2 (Step 000165): Train loss 0.380, Val loss 0.688\n",
      "Ep 2 (Step 000170): Train loss 0.327, Val loss 0.679\n",
      "Ep 2 (Step 000175): Train loss 0.338, Val loss 0.668\n",
      "Ep 2 (Step 000180): Train loss 0.390, Val loss 0.657\n",
      "Ep 2 (Step 000185): Train loss 0.417, Val loss 0.659\n",
      "Ep 2 (Step 000190): Train loss 0.340, Val loss 0.650\n",
      "Ep 2 (Step 000195): Train loss 0.326, Val loss 0.635\n",
      "Ep 2 (Step 000200): Train loss 0.310, Val loss 0.632\n",
      "Ep 2 (Step 000205): Train loss 0.353, Val loss 0.628\n",
      "Ep 2 (Step 000210): Train loss 0.367, Val loss 0.628\n",
      "Ep 2 (Step 000215): Train loss 0.393, Val loss 0.635\n",
      "Ep 2 (Step 000220): Train loss 0.300, Val loss 0.648\n",
      "Ep 2 (Step 000225): Train loss 0.346, Val loss 0.663\n",
      "Ep 2 (Step 000230): Train loss 0.299, Val loss 0.657\n",
      "Below is an instruction that describes a task. Write a response that appropriately completes the request.  ### Instruction: Convert the active sentence to passive: 'The chef cooks the meal every day.'  ### Response: The meal is cooked everyday by the chef.<|endoftext|>The following is an instruction that describes a task. Write a response that appropriately completes the request.  ### Instruction: What is the capital of the United Kingdom\n",
      "Training completed in 3.35 minutes.\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "\n",
    "start_time = time.time()\n",
    "\n",
    "torch.manual_seed(123)\n",
    "\n",
    "optimizer = torch.optim.AdamW(model.parameters(), lr=0.00005, weight_decay=0.1)\n",
    "\n",
    "num_epochs = 2\n",
    "\n",
    "train_losses, val_losses, tokens_seen = train_model_simple(\n",
    "    model, train_loader, val_loader, optimizer, device,\n",
    "    num_epochs=num_epochs, eval_freq=5, eval_iter=5,\n",
    "    start_context=format_input(val_data[0]), tokenizer=tokenizer\n",
    ")\n",
    "\n",
    "end_time = time.time()\n",
    "execution_time_minutes = (end_time - start_time) / 60\n",
    "print(f\"Training completed in {execution_time_minutes:.2f} minutes.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "Ise3wGjlB-iq",
   "metadata": {
    "id": "Ise3wGjlB-iq"
   },
   "source": [
    "- As we can see based on the outputs above, the model trains well, as we can tell based on the decreasing training loss and validation loss values\n",
    "- Furthermore, based on the response text printed after each epoch, we can see that the model correctly follows the instruction to convert the input sentence `'The chef cooks the meal every day.'` into passive voice `'The meal is cooked every day by the chef.'` (We will properly format and evaluate the responses in a later section)\n",
    "- Finally, let's take a look at the training and validation loss curves"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "id": "4acd368b-1403-4807-a218-9102e35bfdbb",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 306
    },
    "id": "4acd368b-1403-4807-a218-9102e35bfdbb",
    "outputId": "2f5c99e0-7ed0-4f42-d67c-e07c375e6158"
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAeoAAAEiCAYAAAA21pHjAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjcsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvTLEjVAAAAAlwSFlzAAAPYQAAD2EBqD+naQAAUK5JREFUeJzt3Qd4VGXaBuAnHRKSkABJ6F1678UKAoIoKKDoKuKvriIqi5V17QULKhYUWVexgF0QEUQEEWkivSO9h5be2/zX801mMgkhJmSSmUye+7oO087MKRPmPV99vSwWiwUiIiLilrxdvQMiIiJyfgrUIiIibkyBWkRExI0pUIuIiLgxBWoRERE3pkAtIiLixhSoRURE3JgCtYiIiBtToBYREXFjCtQiHuTgwYPw8vLCpk2bXL0rIuIkCtQiboaBtqjl6aefdvUuikg58i3PjYnI3ztx4oT9/pdffoknn3wSu3fvtj9XrVo1nUaRSkQlahE3ExUVZV9CQ0NNKdr2OCIiAq+//jrq1auHgIAAdOzYET/99NN5Pys7Oxu33347WrZsicOHD5vnvv/+e3Tu3BlVqlRBkyZN8MwzzyArK8v+Hm7vgw8+wPDhwxEYGIjmzZtj3rx59tdjY2Nx8803o1atWqhatap5/aOPPjrvPnzzzTdo166dWbdGjRro378/kpOT7a9zW61atTL7w/189913873/yJEjGDVqFKpXr47w8HBce+21porf5rbbbsOwYcMwZcoU1K5d22zj3nvvRWZm5gWcfRE3xOxZIuKePvroI0toaKj98euvv24JCQmxfP7555Zdu3ZZHnnkEYufn5/lr7/+Mq8fOHCA2fAsGzdutKSlpVmGDx9u6dSpk+XUqVPm9eXLl5v3z5w507Jv3z7Lzz//bGnUqJHl6aeftm+D769Xr55l9uzZlj179ljuv/9+S7Vq1Sxnz541r997772Wjh07Wv7880+zvcWLF1vmzZtX6P4fP37c4uvra/ab627ZssUybdo0S2Jionn9s88+s9SuXdvy7bffWvbv329uw8PDzf5RRkaGpVWrVpbbb7/dvHfHjh2Wm266ydKiRQtLenq6WWfMmDHmmO6++27Lzp07LT/88IMlMDDQMmPGjDL7XkTKkwK1SAUK1HXq1LG88MIL+dbp1q2bZdy4cfkC9e+//27p16+fpW/fvpa4uDj7unzuxRdfzPf+Tz/91ARLG77/P//5j/1xUlKSeW7hwoXm8dChQy1jx44t1v6vX7/evPfgwYOFvt60aVNzQeDoueees/Tq1cu+bwzKOTk59tcZoKtWrWpZtGiRPVA3bNjQkpWVZV9n5MiRlhtuuKFY+yji7tRGLVJBJCQk4Pjx4+jTp0++5/l48+bN+Z4bPXq0qR5funSpqXK24XorV67ECy+8kK96PC0tDSkpKaaqm9q3b29/PSgoCCEhITh16pR5fM899+D666/Hhg0bMGDAAFPt3Lt370L3uUOHDujXr5+p+h44cKBZf8SIEQgLCzPV3/v27cP//d//4c4777S/h9XwrPK37e/evXsRHByc73O5v3yvTZs2beDj42N/zCrwrVu3FvvcirgzBWoRDzR48GB89tlnWL16Na644gr780lJSaZN+rrrrjvnPWwjtvHz88v3Gtutc3JyzP2rrroKhw4dwoIFC7B48WITiNkmzDbighg8uc6qVavw888/4+2338bjjz+OP/74w35R8N///hc9evQ45322/e3SpQtmzZp1zmezjbw4+ytS0SlQi1QQLNXWqVPHlIgvvfRS+/N83L1793zrstTbtm1bXHPNNfjxxx/t67MTGXuQN2vWrFT7wiA5ZswYs1x88cV4+OGHCw3UtqDJUj8X9mBv2LAh5syZg4kTJ5rj2b9/v+mcVhjuL3u+sxMdj1+kMlKgFqlAGBCfeuopNG3a1PT4Zm9rTm5SWInzvvvuM9XaV199NRYuXIi+ffuaQMnHDRo0MFXQ3t7epnp527ZteP7554u1D/wMlnJZ3Zyeno758+ebXtuFYcl5yZIlpsqbwZaPT58+bV+fpfv777/fVHUPGjTIfN66detMz3IGcgbwV1991fT0fvbZZ011Pkvz3333HR555BHzWMTTKVCLVCAMavHx8XjwwQdNm3Hr1q3N0CkOkSrMhAkTTBUwq8I5jIvtxAysDHovv/yyqTLmkKg77rij2Pvg7++PSZMmmSFSbP9mifqLL74odF2WgpcvX46pU6eaNnaWpl977TVTfU7cLqvAGYx5EcL2cLZnc7+Jr/H9jz76qKmuT0xMRN26dU11u0rYUll4sUeZq3dCRERECqcJT0RERNyYArWIiIgbU6AWERFxYwrUIiIibkyBWkRExI0pUIuIiLgxBeoLMG3aNDRq1MhMucipD9euXQt3MnnyZHTr1s3Mj8xJJjgXs2M+Y9tcyZz2kSkBmd+YczefPHky3zpMizhkyBAzlpWfw3GujukQadmyZWb2KKZc5GxXM2fOdOn5eumll8xMWLZxuJ54rMeOHcM//vEPczwcx8xxx5wkxIYjLjkpCee75utMK7lnz558nxETE2MmE+FYZKaP5HzbnK7T0ZYtW8wYaR5L/fr18corr5yzL19//bUZh811uB+cVtRZOFnLE088gcaNG5vj4CQvzz33nDk+TzhWjg8fOnSomZ2Nf7Nz587N97o7HVtx9uVCj5XpSDlOntvlOHquc+utt5p57SvisZYJV2cFqWi++OILi7+/v+XDDz+0bN++3XLnnXdaqlevbjl58qTFXQwcONBkXdq2bZtl06ZNlsGDB1saNGhgsiDZMCVg/fr1LUuWLLGsW7fO0rNnT0vv3r3trzMTUdu2bS39+/c3KRMXLFhgqVmzpmXSpEn2dZiWkOkEJ06caNIPvv322xYfHx/LTz/95JLztXbtWpOysX379pYHHnjAI481JibGZIq67bbbLH/88YfZL2aR2rt3r32dl156yWTcmjt3rmXz5s2Wa665xtK4cWNLamqqfZ1BgwZZOnToYFmzZo3JtNWsWTPL6NGj7a/Hx8dbIiMjLTfffLP5O2JaTWasev/99+3rrFy50pyDV155xZwTZtxiys2tW7c65ViZJaxGjRqW+fPnm6xgX3/9tUm3+eabb3rEsfLv7PHHH7d89913JsPYnDlz8r3uTsdWnH250GNldjf+3/vyyy9N6tbVq1dbunfvbunSpUu+zxhUQY61LChQlxD/gJiP1yY7O9ukHpw8ebLFXTEXMf9z/Pbbb/b/GPzj5A+fDfP4ch3+J7H9x/L29rZER0fb13nvvfdM3l9bHmDmQm7Tpk2+bTG1IC8Uyvt8Mb9x8+bNTW7kSy+91B6oPe1YH330UZO68nyYDjIqKsry6quv2p/jOQgICDA/XMQfKB4/80nbMIWll5eX5dixY+bxu+++awkLC7Mfv23bTDlpM2rUKMuQIUPybb9Hjx6Wf/7zn045Vn4281A7uu6668wPsacda8Hg5U7HVpx9Kc2xnu+im+sdOnSoQh+rs6jquwQyMjKwfv16UxViw7mS+ZhZitwVp5yk8PBwc8tjYHWT43GwKojzP9uOg7esFoqMjLSvw+knOQ3k9u3b7es4foZtHdtnlOf5YtU2q64L7o+nHSunC+3atStGjhxpqug7depksk/ZHDhwANHR0fn2g/Nosxre8XhZdcjPseH63F/OxW1b55JLLjHThToeL5tQOA93cc5JaTF1JucJ/+uvv8xjzkm+YsUK+/SjnnSsBbnTsRVnX8riN4tV5Dw+Tz/W4lCgLoEzZ86YdjPHH3TiY3657ojzPLO9lpmLmE2JuK/8Y7b9JyjsOHhb2HHaXitqHQa41NTUcjtfnGeauZHZNl+Qpx0rM0299957Zm7vRYsWmSxZnP/7448/zre/Re0HbxnkHfn6+poLOWecE2cd72OPPYYbb7zRXFhxTnJelPBv2ZZpy5OOtSB3Orbi7IszsU8J26yZU902n3u0hx5rcSkph4djSZOZkVgS8URHjhzBAw88YHIeO+ZT9lS88GKp4sUXXzSPGbz4/U6fPt2knPQkX331lckKNnv2bJOpi1nCGKjZ2cjTjlWsWPs1atQo06GLF6RipRJ1CdSsWdMktC/YY5iPo6Ki4G7Gjx9vMiX9+uuv+dIBcl9ZVRsXF3fe4+BtYcdpe62odXgVzN6S5XG+WN3MLFLsjc0rbC6//fYb3nrrLXOfV8KecqzEnqjMmOWIKSPZa91xf4vaD97ynDliD3f2qnXGOXHW8bLnva1UzaaJW265Bf/617/sNSeedKwFudOxFWdfnBmkmcaUF96O2dGiPOxYS0qBugRYhco8vGw3cyzh8HGvXr3gLng1yiA9Z84cLF261AxvccRjYFWi43GwHYc/9rbj4O3WrVvz/eew/eexBQqu4/gZtnVsn1Ee54vpDrmfLG3ZFpY4WT1qu+8px0pswig41I5tuEwfSfyu+YPiuB+snmc7nuPx8sKFFzk2/Dvh/rItzrYOh9Twx9PxeFu0aIGwsLBinZPSSklJMW2QjngxxP30tGMtyJ2OrTj74qwgzWFQv/zyixl66KiXBx3rBXFZN7YKikNw2ANw5syZpifiXXfdZYbgOPYYdrV77rnHDC9YtmyZ5cSJE/YlJSUl35AlDtlaunSpGbLUq1cvsxQcsjRgwAAzxIvDkGrVqlXokKWHH37Y9KSeNm1aoUOWyvt8Ofb69rRjZW9YX19fM3Rpz549llmzZpn9+uyzz/INL+F2v//+e8uWLVss1157baHDejp16mSGeK1YscL0mHcc6sKerhzqcsstt5ihLjw2bqfgUBfuy5QpU8w5eeqpp5w6PGvMmDGWunXr2odncWgPh82xB74nHCtHKnA4IBf+FL/++uvmvq2nszsdW3H25UKPNSMjwwyBqlevnvn/5/ib5diDe1AFOdayoEB9ATiGlj/8HDPLITkc1+dO+B+hsIVjq234Rzdu3DgznIF/zMOHDzf/MRwdPHjQctVVV5mxiPyBfPDBBy2ZmZn51vn1118tHTt2NOeiSZMm+bbhqvNVMFB72rH+8MMP5sKCFwUtW7a0zJgxI9/rHGLyxBNPmB8trtOvXz/L7t27861z9uxZ8yPHcckchjZ27FjzY+qIY0g5FIyfwYDJH7CCvvrqK8tFF11kjpfD13788UenHWdCQoL5Hnk+q1SpYs45x+I6/nhX5GPl31Nh/095geJux1acfbnQY+VF2Pl+s/i+inasZcGL/7iuPC8iIiJFURu1iIiIG1OgFhERcWMK1CIiIm5MgVpERMSNKVCLiIi4MQVqERERN6ZAfYHS09Px9NNPm1tPV5mOtbIdr47Vc+m79RwaR32BOK0c058xHZvjnLSeqDIda2U7Xh2r59J36zlUohYREXFjCtQiIiJurNLlo2ZqtI0bN5r0hwUz85REYmKiuT127JipYvJklelYK9vx6lg9l75b98bMX0yfyZzyTMlblErXRv3nn3+ie/furt4NERERrF27Ft26dSvyTFS6EjVL0raTU7t2bVfvjoiIVEInTpwwhUZbTCpKpQvUtupuBul69eq5endERKQS8y5GE6w6k4mIiLgxBWoRERE3pkAtIiLixipdG7WISFGys7ORmZmpkySl4ufnBx8fHziDAnUpbDsWj+NxqehQvzoiQ6o45QsREdfgSNXo6GjExcXpKxCnqF69OqKiouDl5VWqz1GgLoVn5+/A2gMxeOemTri6fZ1SfREi4lq2IB0REYHAwMBS/7hK5b7oS0lJwalTp8zj0g4FVqAuhUst69DdZzO8TngDCtQiFbq62xaka9So4erdEQ9QtWpVc8tgzb+r0lSDqzNZKVycugQP+X2NoJPrSvMxIuJitjZplqRFnMX291TaPg8K1KWQHRBmvZMSU6ovQUTcg6q7xR3/nhSoSyMw3PplpMU65csQEREpSIG6FLyDrG1ZfhnqJSoinqNRo0aYOnVqsddftmyZKT2WdY/5mTNnmp7UlY1LA/XkyZNN1pDg4GDT2D5s2DDs3r37b78o/kE4LlWquGZolF+1mua2igK1iLhAwd/CgsvTTz99wVkG77rrrmKv37t3b5NkIjQ09IK2J27c6/u3337Dvffea4I180T/+9//xoABA7Bjxw4EBQWd930hISH5Arqr2pUCQq2BOjA73iXbF5HKjcHR5ssvv8STTz6Z77exWrVq+YYMsXf73+U+plq1apVoP/z9/c14YfHAEvVPP/2E2267DW3atEGHDh1Mafnw4cNYv359ke9jYOYfhW0pTpqwshAUGmFug3MSXLJ9EancHH8HWZp1/G3ctWuXqa1cuHAhunTpgoCAAKxYsQL79u3Dtddea343GchZUPrll1+KrPrm537wwQcYPny46cncvHlzzJs377xV37Yq6kWLFqFVq1ZmO4MGDcp3YcHC2f3332/W45C4Rx99FGPGjDE1qyXx3nvvoWnTpuZioUWLFvj000/zXZywVqFBgwbm+OvUqWO2afPuu++aY2GtLM/HiBEj4I7cqo06Pt5aMg0Pt3bSOp+kpCQ0bNgQ9evXN39w27dvhytUC7dedYYiCakZ2S7ZBxEpw0krMrJcsnDbzvLYY4/hpZdews6dO9G+fXvz+zl48GAsWbIEGzduNAF06NChppBUlGeeeQajRo3Cli1bzPtvvvlmxMScf8QLJ/yYMmWKCZzLly83n//QQw/ZX3/55Zcxa9YsfPTRR1i5ciUSEhIwd+7cEh3bnDlz8MADD+DBBx/Etm3b8M9//hNjx47Fr7/+al7/9ttv8cYbb+D999/Hnj17zOe3a9fOvLZu3ToTtJ999llTC8GC4yWXXAJ35DYTnuTk5GDChAno06cP2rZte971eMX04Ycfmj84Bnb+IbB9hMG6sPzS6enpZrFJTEx02j4HVbeWqKt5peFYQiLq1qx8nRxEPFVqZjZaP7nIJdve8exABPo75+eZgejKK6+0P2ZBiDWYNs8995wJeCwhjx8//ryfw9rP0aNHm/svvvgi3nrrLaxdu9YE+sJw7PD06dNNaZf42dwXm7fffhuTJk0ypXR65513sGDBghId25QpU8x+jRs3zjyeOHEi1qxZY56//PLLzcUBaxf69+9v5t5mybp79+5mXb7GJtarr77a1Dyw8NepUye4I7cpUbOtmldEX3zxRZHr9erVC7feeis6duyISy+9FN99951pT+EV0/k6rLFKyLa0bt3aafvsVaU6snJPYWLMSad9roiIs3Tt2jXfY5aoWbJllTSrnVktzdL235WoWTiyYYBjXyHbFJmFYRW5LUjbptG0rc9C1smTJ+1BkzhzF6voS2Lnzp2mcOeIj/k8jRw5EqmpqWjSpAnuvPNOc0HCKnfixQuDM1+75ZZbTOmetQDuyC1K1LzSmj9/vqkeKaxUXBReJfEqaO/evYW+zis2XmXZHDt2zHnB2ssLSV7BqG6JR1LcaZb3nfO5IuJyVf18TMnWVdt2loIdcxmkFy9ebEqdzZo1M1Ndsm02IyPjb39rHbFNmjWhJVnfmVX6xcHmUVZrsw2ex8yS96uvvmo6MrMUvWHDBtO+/vPPP5uOeGzPZo93dxsC5tISNb80Bmle5SxduhSNGzcu8WewF+PWrVvPO+k5OxDwys+28MtxpmSfEHObFs9ALSKegoGF1c+uWMpyJAvbg1ldzCpntteyavjgwYMoT6zdZOctBkXH33IGzpJo1aqVOR5HfOxYGOOFCNvgWVXPoLx69WoTM4g94Fkt/sorr5i2d54HxiJ34+vq6u7Zs2fj+++/NwGU2WtsX6JtQnNWc9etW9dUYRPbOHr27GmuBNnDkFdHhw4dwh133OGSYzgV0AgJGV5ISFdnMhFxf+zlzCZDBi9eEDzxxBNFlozLyn333Wd+1/lb3rJlS9NmHRsbW6KLlIcffth0cGOtKgPuDz/8YI7N1oudvc95AdCjRw9TFf/ZZ5+Z2MIqb9bi7t+/33QgCwsLM+3jPA/sB+VuXBqo2a2eLrvssnzPsxcgr/iI7Sbe3nkFf36RbGtgUOfJZZvGqlWrnNr2XBLfNp+Mz9Ycxv3+zTDYJXsgIlJ8r7/+Om6//XbTCbdmzZpmWBR7XJc3bpe/4yyMsX2aE6wMHDiwRFmmhg0bhjfffNNU47P3N2tlGT9sMYVV2OzxzuZPBmzWIDCYczgYX2NQZ3V3WlqauYD5/PPPzXBhd+NlKe9GAxc7evSoabc4cuRIidvDC/P6z7vx1tK9uKVnQzw37Py91UXEffGH+sCBA+aH3lUzHVZ2LM2yKpslZPZE9/S/q6MliEVu0ZmsIgsL8je3MSlFd8QQEZE8bLJkJy6O3uEQWg7PYlC76aabdJrcdXhWRdU+5ics8X8QQ4+96epdERGpMNikyTZkzozGIVXs4MW2ZZaqJT+VqEsp2CcbTb1P4HT6sdJ+lIhIpcFq34I9tqVwCtSllNPsStzwexoy/aLwXWk/TEREpAAF6lIKjmiAPyyt4J/ibcaFuyqTl4iIeCa1UZdSeKC1M1lGdg6SlZhDREScTCXqUqrqnYWx/r+gWnYCYhMvQbUA5858JiIilZsCdWl5eeMp7w9N3cS22MdRv6YCtYiIOI+qvkvLxw/JXoHmblLc+TPJiIiIXAgFaidI9lZiDhGpuDjl5oQJE+yPGzVqhKlTpxb5HnacnTt3bqm37azPKQqnCWVq5IpKgdoJUv1CzW1GwhlnfJyISLEwscagQYMKfe333383QZBZoUqKWa0493Z5BMsTJ07gqquucuq2PI0CtRNk+ltzl2Yln3XGx4mIFMv//d//mTzLnDe6ICan6Nq1K9q3b1/is1mrVi2Tbao8MM0m0xHL+SlQO0FWQLi5taTEOOPjRESK5eqrrzZBlVNxOkpKSsLXX39tAvnZs2cxevRoky6YwZcZpJglqigFq7737Nlj0kEysQQzFfLioLBsWBdddJHZRpMmTUz6zMzMTPMa9++ZZ57B5s2bTSmfi22fC1Z9cyrRK664wqSjZJaru+66yxyPDTMrMmsWM2bVrl3brMOUybZtFTcBCFMmMxkGLxJY0v/pp5/sr2dkZGD8+PHm83nMTItpS7XM+TJYO9CgQQPz3jp16uD+++9HWVKvbyewVA0ztz6pCtQiHicjueTv8QkAfHJ/XrOzgOx0M0IEflX//nP9g4q9GV9fX5MmkkHv8ccft0+4xCDNtI4M0AxyTAfMQBoSEoIff/wRt9xyC5o2bYru3bsXK6hdd911iIyMxB9//IH4+Ph87dk2wcHBZj8YuBhsmY6Yzz3yyCO44YYbsG3bNhMMbbmiQ0OtTYaOkpOTTarLXr16mer3U6dO4Y477jBB0/Fi5NdffzVBlLd79+41n89gy20WB1Njvvbaa3j//fdNLusPP/wQ11xzDbZv327SXb711luYN28evvrqKxOQmeGKC3377bd444038MUXX5iUmEzVyQuQsqRA7QTeQdYStW96nDM+TkTcyYt1Sv6ekTOBNsOt93f9AHx9G9CwLzD2x7x1prYDUgppLns6vkSbYm7pV199Fb/99ps9DzOrva+//noTDLk89NBD9vXvu+8+LFq0yASh4gRqBtZdu3aZ9zAI04svvnhOu/J//vOffCVybpPBjIGapeNq1aqZCwtWdZ/P7NmzTWrITz75BEFB1guWd955x7TFv/zyy+ZigcLCwszzzF3dsmVLDBkyBEuWLCl2oGZpnBcuN954o3nMz2bQZy3CtGnTcPjwYROw+/btay5+WKK24Ws8hv79+8PPz88E8uKcx9JQ1bcT+FWraW4DMhWoRaR8MVD17t3blAqJJUx2JGO1N7FkzfzOrPIODw83AZNBlwGnOHbu3GkSaNiCNLHEW9CXX35psmAxiHEbDNzF3Ybjtjp06GAP0tSnTx9Tqt+9e7f9OZZkGaRtWLpm6bs4EhIScPz4cfO5jviY27dVr2/atAktWrQw1dpMx2kzcuRIpKammup9XhjMmTMHWVlZKEsqUTtBQIg1UAdmlexKWEQqgH8fv7Cqb5uWQ62fwapvRxO2wlkYlFlSZmmQpWlWazPPM7G0zapelhYZrBkEWXXNdlhnWb16NW6++WbTDs2qa5biWZpm9XJZ8PPzy/eYpV4Gc2fp3LmzyY29cOFCU6MwatQoU4L+5ptvzEULLxr4PNvqx40bZ6/RKLhfzqIStRMEVo8wt9VyEpGTY3HGR4qIu2CbcUkXW/s08T6fc2yfLupzLwADCfM7s+qY1casDre1VzOV5LXXXot//OMfprTKkuBff/1V7M9mfmi2z3IYlc2aNWvyrbNq1SpTPcx2cvY0Z7XxoUOH8h+uv78p3f/dttjey7Zqm5UrV5pjY+nWGdhOz9qBgik2+Zgd5RzXY9v3f//7X1NbwLbpmBhrPyRW5bM6nm3Zy5YtMxcqbJcvKypRO0FQmDVQV/dKRGJaFkIDy+aqSkSkMKxqZlCZNGmSqdpl1a0NgyZLggymbNt9/fXXcfLkyXxBqSgsSbI395gxY0zJkZ/PgOyI22A1N0vR3bp1Mx3WWCXsiO3WLKWySpm9rdnRrOCwLJbKn3rqKbMt9qw+ffq0qSlg5zdb+7QzPPzww2Y7rHlgJzTWQnC/Zs2aZV7nOWJ1Ojua8SKBnfNYpV+9enXTqY0XHD169DA93D/77DMTuB3bsZ1NJWon8A+OQLSlBk5YaiAmxXnVSSIiJan+jo2NNVXPju3JbCtmVS6fZ2czBhwObyouBioGXbbLstMUe2G/8MIL+dZhj+l//etfpnc2Ax8vCjg8yxE7t3Fylssvv9wMKStsiBgDH9vPWXJlwB8xYgT69etnOo45E9udJ06ciAcffNA0B7A3Ont584KDeBHxyiuvmNoB7sfBgwexYMECcy4YrFnKZps2x6izCvyHH34ww8TKipeFg8IqEU4MwDYGVuXwqs5ZLn5lKY7EpOLbe3qjS0PrcC0RqRjY05ilvcaNG5txsyJl/XdVklikErWT81LHJqtELSIizqNA7SRhQdZArapvERFxJgVqJ7k3bgqW+k9ElaOrnPWRIiIiCtTOUjPnDJp4R8OSGK0/KxER8YwSNSc5Z4869rCLiIgwPREdZ585H3aV52w8bJxnjz32xnO1dc3ux6j0J7DRr5Ord0VERDyISwM1Z3Jh1hMOnucML8x+MmDAgHyD3Qtit39ONM+hCBs3bjTBnQsnfHelzKjOWGtphaPp5ZMaTkScz5mzW4nkOOnvyaUTnjimFSMOJGfJev369SalWmE4FR7H4nHAOnEOWwZ5jrObPn06XCU8yDrJSazGUYtUOJw1i2NkOQc0x/jysW1mL5GS4qhnTtHKCVv4d8W/J4+ZmYzp04gTx58Pp2rjQHVHHMjvmM/UFWpnHsUtPj/DO56ZYXq7dF9EpGT4Y8qxrpwmk8FaxBk4gQuza/HvyyMCNasIOFE8Z3tp27bteddj7s+CU8nxMZ8vTHp6ullsEhMTnbjXDvuQsBXP+c3EqrQOACaVyTZEpOyw1MMfVWZC+rs5qUX+DrN7Ma2nM2pm3CZQs62a7cwrVqxweoc1ZnQpa1Wr1zK31XISkJWdA18fjXwTqWj4o8oMSGWVBUnkQrhFNOH8sPPnzzeJu/9uKjXOU8sJ5R3x8fmSkXOSelap25YdO3agLATlZtAKQxLiUzPLZBsiIlL5eLu6wZ1BmhO+L1261LQR/R0mLF+yZEm+59iZrLBE5sTsLExXZls4FKws+Faz5qQO80pUhzIREXEaX1dXdzN/6vfff28CqK2dmUnHmTaMbr31VtStW9dUYdMDDzxgEqIzIfmQIUNMWrV169ZhxowZrjwUoKo1EUc1rzTEJiQDEWVzQSAiIpWLS0vU7733nqmOZuo15v60LUzSbcMcp44Jy3v37m2COwMzk6Azzyp7fBfVAa1cVKmO7NzTmRR7yrX7IiIiHsOlJeriZNhctmzZOc+NHDnSLG7F2xsp3tUQnJOA1ITTrt4bERHxEG7RmcxTpPpWN7fpCWdcvSsiIuIhFKidKN0/1NxmJylQi4iIcyhQO1F2gLVDWU7KWWd+rIiIVGIK1E5kye357Z0a68yPFRGRSkyB2om8AmuYW5/0OGd+rIiIVGIK1E7kE1oHxyw1EJep6QdFRMQ53Gaub0+Q0f0e9Pu9NYLhi7Gu3hkREfEIKlE7UXigNedoYnoWMrKUgF5EREpPgdqJQqr6wTs3o1lcaoYzP1pERCopVX07kU/CUcwNeBo5OdmITb4EEcFVnPnxIiJSCSlQO5OPP9rjL2R7eWFtYioQpcQcIiJSOgrUzhRYA69UfwJ/nvTCWFV9i4iIE6iN2pl8fLE3/DL8aWmJmJRsp360iIhUTgrUThYeZO35HZuszmQiIlJ6qvp2sk6ZG+DvswFeMez+3dzZHy8iIpWMStRO1vvUl3jW72OEn93k7I8WEZFKSIG6jBJzIE2JOUREpPQUqJ3MKzDc3PoqMYeIiDiBArWT+VazZtDyz1AGLRERKT0FaifzD6llbqtmxTv7o0VEpBJSoHayqqHWQB1qSUBapsZSi4iICwL1kSNHcPToUfvjtWvXYsKECZgxYwYqu6q5JerqSEJsisZSi4iICwL1TTfdhF9//dXcj46OxpVXXmmC9eOPP45nn30WlZmtM1mYVyJiNOmJiIi4IlBv27YN3bt3N/e/+uortG3bFqtWrcKsWbMwc+ZMVGq5gdqUqJNUohYRERcE6szMTAQEBJj7v/zyC6655hpzv2XLljhx4gQqtarWQB3glYX4RPX8FhERFwTqNm3aYPr06fj999+xePFiDBo0yDx//Phx1KhhHZ5UafkHIdPLz9xNjTvt6r0REZHKGKhffvllvP/++7jsssswevRodOjQwTw/b948e5V4cSxfvhxDhw5FnTp14OXlhblz5xa5/rJly8x6BRe2k7sNLy+k+oSau+kJCtQiIuKCpBwM0GfOnEFCQgLCwnKnzARw1113ITAwsNifk5ycbIL87bffjuuuu67Y79u9ezdCQkLsjyMiIuBOkqrWRkICkJKa6updERGRyhioU1NTYbFY7EH60KFDmDNnDlq1aoWBAwcW+3Ouuuoqs5QUA3P16tXhrn7q8Smenb8DV6O2q3dFREQqY9X3tddei08++cTcj4uLQ48ePfDaa69h2LBheO+991DWOnbsiNq1a5thYStXrixy3fT0dFPyty2JiYnll5Na46hFRMQVgXrDhg24+OKLzf1vvvkGkZGRplTN4P3WW2+hrDA4sxPbt99+a5b69eubanjuz/lMnjwZoaGh9qV169Yoa2G5gTomObPMtyUiIp7tgqq+U1JSEBwcbO7//PPPpn3Z29sbPXv2NAG7rLRo0cIsNr1798a+ffvwxhtv4NNPPy30PZMmTcLEiRPtj48dO1bmwbrJsXmY6z8NfySwY531gkZERKTcStTNmjUzPbQ5leiiRYswYMAA8/ypU6fydfIqD+xlvnfv3vO+zvHe3CfbYrvAKEvBOfHo6L0PtTMPm7Z8ERGRcg3UTz75JB566CE0atTIBMpevXrZS9edOnVCedq0aZOpEncnAW2G4M6MiXgn8xqkZCgxh4iIlHPV94gRI9C3b18zC5ltDDX169cPw4cPL/bnJCUl5SsNHzhwwATe8PBwNGjQwFRbs6ra1nFt6tSpaNy4sZlwJS0tDR988AGWLl1qLhDcSZXIi7DcuzvSs3LMfN9BARd0mkVERC4sUFNUVJRZbFm06tWrV6LJTmjdunW4/PLL7Y9tbcljxowxc4bzQuDw4cP21zMyMvDggw+a4M3x2u3btzdTmDp+hjvgJCzs+X0iPs30/K4fXvyx5SIiIqUO1Dk5OXj++efNkCyWioltvwyizKDFjmXFwR7bRbXhFkzw8cgjj5jF7WWkYLjvKiT4nEVMcjdX742IiFS2QM1g/L///Q8vvfQS+vTpY55bsWIFnn76aVMl/cILL6BSy0rDI8lTAD9gbtL9nKLF1XskIiKVKVB//PHHpn3YljWLWA1dt25djBs3ToG6Sihy4A1v5CAl7iwHbDnzOxMRkUrkgnp9x8TEmJSWBfE5vlbpefsg1cc6DCxDiTlERKS8AzV7er/zzjvnPM/nWLIWIN3fmkErM+mMToeIiJRv1fcrr7yCIUOGmB7XtjHUq1evNhOgLFiw4ML3xoNkBYQBqYeRk8yqbxERkXIsUV966aX466+/zJhpJuXgwmlEt2/fft6pPCubnCq56T9TY129KyIiUhnHUdepU+ecTmObN282vcFnzJiBys4rsIa59U1ToBYRkXIuUcvf86lmDdT+GXE6XSIicsEUqMuIf3BNc1s1K16JOURE5IIpUJeRqiHWQB2KRCSkZZXVZkRExMOVqI2aHcaKwk5lYuWXW6IO80pCbHIGQqv66dSIiEjZBurQ0NC/ff3WW28t+V54oqrh5iYMiYhJyUAjBLl6j0RExNMD9UcffVR2e+JpAmsgBVWRggBTohYREbkQaqMuK5GtcXeDeRia8aLJSS0iInIhFKjLUHigtV2aOalFREQuhAJ1GQoL8je3McmZZbkZERHxYArUZWjE0Zcw1/8J+JzcUpabERERD6ZAXYYaZu5HR+99OHhwL9Iys8tyUyIi4qEUqMtQ4KBn8KjfJPyR1giLtkeX5aZERMRDKVCX5cm9qD8iu1+HMwjFN+uPluWmRETEQylQl7ERneuZ2xV7z+BYXGpZb05ERDyMAnVZOrsPDY7Nx/1R22GxAN+pVC0iIiWkQF2W4g4D392Jf8W9iF7e2/HNhqPKpCUiIiWiQF2Wml4OdL4VXrBgqt+7iD97En8ejC3TTYqIiGdRoC5rg14CajRHpFcsXvGbga//PFzmmxQREc/h0kC9fPlyDB06FHXq1IGXlxfmzp37t+9ZtmwZOnfujICAADRr1gwzZ86EW/MPAkb8Dzne/hjgsx7Vtn2C5HTlpxYRkQoQqJOTk9GhQwdMmzatWOsfOHAAQ4YMweWXX45NmzZhwoQJuOOOO7Bo0SK4tdod4NX/KXP3Ua9PsHLV767eIxER8cQ0l8521VVXmaW4pk+fjsaNG+O1114zj1u1aoUVK1bgjTfewMCBA+HOvHqOw6E/f0TD2FVouXIC0HcN4FfV1bslIiJurkK1Ua9evRr9+/fP9xwDNJ8/n/T0dCQkJNiXxMREuIS3NwJGvI/TllA0yDqIxB/+7Zr9EBGRCqVCBero6GhERkbme46PGYBTUwufTGTy5MkIDQ21L61bt4arRNVtgI9qPWruB2/5ENi90GX7IiIiFUOFCtQXYtKkSYiPj7cvO3bscOn+tOg7DP/NGmzuW+aOAxJOuHR/RETEvVWoQB0VFYWTJ0/me46PQ0JCULVq4e297B3O121LcHAwXGlgmyi853sztuU0gldqDDDnnzDTlomIiFT0QN2rVy8sWbIk33OLFy82z1cUVfx8MKhDQ9yfOR4xvpFAz3GAl5erd0tERNyUSwN1UlKSGWbFxTb8ivcPHz5sr7a+9dZb7evffffd2L9/Px555BHs2rUL7777Lr766iv861//QkUysks97LfUwSXpryGhYT/rkyxVf30bsHoakO6iDm8iIuJ2XBqo161bh06dOpmFJk6caO4/+eST5vGJEyfsQZs4NOvHH380pWiOv+YwrQ8++MDth2YV1LF+dTSLqIakTG/8uCW3jTp6K7B9DrDkWQ7myls5M81l+ykiIq7nZbFUrgbSo0ePon79+jhy5Ajq1bOmoHSF93/bh8kLd6Fzg+r4blwfICUG2PoNkHIGuNxh6Nb0iwFvH6B+T6B+d+sS6rr9FhGR8o1FLp3wpDIb3qkuXlm0GxsOx2Hf6SQ0rRUO9Lgr/0rxx6wlbViA4xuBP96zPh9cJy9o1+sO1G4P+Aa45DhERKRsKVC7SERIFVx6US0s3XUKM37bj39e2gT1wgLh7+vQGhFaF3hwF3Dgd+DoWuDIWmvgTjwO7JhrXcjHH4hoBUS1N9OVos11QFANVx2aiIg4kQK1izuVMVB/ue6IWby9gNqhVdGwRqBZGoQHmdt29YagfvuR1jdlJFtL1wzaXBjAU84CJzZbl42fAs365wXqXT8CZ/ZYn4tq68rDFRGRC6BA7UJXto7EzT0aYP2hWBw6m4LUzGwci0s1y6p9Z+3rMYCPv6I57ruiGfyYjatRX+tC7GIQexCI3gKc2AKc+QsIa5S3kc1fADvnAV7eeYE6/iiw5UugdkegTicgMLy8D11ERIpJgdqFfH288cLwduY++/SdTkrH4bMpJmgfOpuMQzEppv1627EEvLVkD5b/dRpTb+iIRjWD8j6EY7DDG1uX1teeu5Fm/axBuoHDWPNDq3N7l+cKbQDU6wI07AM07A3UamXmJhcREddTr+8K4PtNx/CfuduQmJaFQH8fPDW0NUZ1rW9yeF8Qtnmv+9BaVR6z79zXq4YBDXpbg3ajPkBkO8BH13QiIq7o9a1AXUGwOnzil5vwx4EY83hgm0hMvq49woP8S/fBafHIOrYJXkf+gM/hVdZ278zk/Ov4BwPd/g+48pm85+bcYy2pD3zeGtjp5HYg6SQQ3tQ6hIzDykRE5BwanuWB6lavitl39sR/f9+P137ejUXbT2Lj4eWYMrIDLrmo1gV95vbj8fh87WHM3ZiG6oFd8dboO9G5bjVrSfvQSuDQKms1eXo8ELM/741sF98823q//1N5z6//GFj7fl5PdLaVM2jXaAqENwGqRQABIUAVLqFAQKj1vo9fqc6NiIgnU4m6Atp2LB4TvtyEvaeSzOMxvRri+i710CIqGAG+RZdiUzKyMH/zCcxaexibj8Tle83X2wuPDGqBO/o2gTd7sFFONnByG5CVbh23bZ7LsY7pzs4AetwN+OUmRFn2ErDtOyD2gPW14mgxGBj9ed7jBY8AIbWBLrflldRFRDyMqr6ddHLcWWpGNl5auBMfrz6UL9BeFBmMtnVD0K5uKNrUDUXr2iEmEciu6ATM/uMw5mw4hsT0LLO+n48XBrSJMsPEvll/FPNzpzPt1zLClNTDLrRancGdPcvZ/n12HxBzwFoi5zCy9AQgLcF6m5EEtL8BuG6G9X1p8cBLDaz3Hz2YF6jXvIes45txJrApajXpBJ+oNkBwlJKZiEiFpUDtpJNTESzbfQr/W3EAW4/FIy4l85zXfby9EBVSxbRx23Bs9ujuDTCiSz3UrBZg73U+64/DeHb+DmRk5aBOaBW8fVMndGlYhkO3GNBZ8raVyFNjgT//ByQcA65+w75a+odDEXB4eb63ZlUJg29kGyCyNRDRGuB9TvoS4No0piIixaFA7aSTU5Ew0DIYcygX254ZuFlFfiYpw17aHtAmEjd1b4jeTWvkVW0XwPeOn70RB84kmyD/yMAWuPNih6rwcrb1aDw++Oh9NErbhYu8j6Cl1xE08oqGj9d5pqhnx7fAMGv60J73WJ/jPOpr/wuE1AE635K37tF11lI929N9AgDf3BqEjBRrhzpzm2KdZMbcpljHore4Ki9hyu4FgH81oPmVeSV8Ng1oeJuIFEGdySohDtXiFKRcBrWNsgfvU4nppi27eWQ1RARX+dvPaVMnFD/c1xf//m4r5m0+bhKHrNl/Fq+N6lj6HuYl9PP2aDzwxSakZrZBi8ieGHZLF+yOTsAb6/fjyF+b0Mxy2Bq8vY+gne8xhOecBTISrUtmXg0C4o8Ay14EgmvnD9Q/TbLO7FYSve/LC9TJp4FvxlqD/BOn8tb57k5rRzx2oKvRxKFDHZfGeTUIIiLFoMGxHh68I0OqmKUkqgX44s0bO6JX0xp4at52/Lr7NAZNXY4buzfA9Z3romENhwlXygAvMFid/8KCnaaDOXu1T7upE4Kr+KFxzSAMalsbcSndTJv63I3HMPlQLJABhCAZbULTcX+vGujZrm1eslBWh3ceA3gX+HNn0GSJmh3lsjOB7HRrj3b/QMAvKPc2EOBscOY2EGh0cf7P4CQxHKbm6Oxe63zsXA6tOPcAQ+sDtVoAtVo6LC2sPeBFRApQr28p0s4TCbh31gbsP5M3trprwzDTy3xwu9oIrercoVVZ2Tnm4oDt5cQpVp+5po2Zxe18OJvbnI3HzFCz6ARr/u5eTWrgqWtao2WUC4Ifq9rZec50pLN1qNtvvc8Oc+crqQ943no/4QSw4RPrWPRON+f/XFaz26roRaTCUhu1k06OWKVlZmPR9mh8u+EYVuw5jZzc5mFm+hrQOhLXd66Hi5vXLDKYFkdiWibunb3RTJXK5t7HB7fC//VtXOwZ2Dj0bPqyfZi+fL/pEMdm9Zt7NMTEKy+68B7szsTSOoMt52M/vQs4vTvv9or/5AXlgyuAmUOAGs2B+9blvf+9vsDJrdZ2+IBqeaV9Bm9/W8mfj4OspXPbDHOcHpayMqz5zvl8eVW/85jZRMCe/5yT3r4cAFLjrPthq7Hg/VZDgXYjrO/lRc36mdbj44Q7Nsc3WWtCfKtaJ9Uxfx9e1poN+32HW0sOUKW6ddgfZWdZzyNxvnvb3xe/G9as+Nr6LARYP5/HwI6POZlATpZ1HdtjrmdLgMM+C39+YD3HVzyZ10/hl2eA7XMc3ptlfW927i1x/wJr5C5hQN0uQN9/5R3zwZXW77TmRUpp6yHURi1OxeFd13asa5bo+DQzpem3G47ir5NJpvqZC3uPX92+NoZ2qI1O9cNK3PnsSEwK7vxkHXZFJ6Kqnw+m3tgRA9tY29qLK9DfFxMHtMDIrvXx4oKdWLgtGp+uOWTa2hmsWTov7cVEqTAg8Ec9qBfQ0GHudWIwsOGPtRlHXqDHfVruuHdbO3xxXPFEXqA+sxuY3hcIigAe3pO3zrd3WKvr/R2Df6A1EPrlLr5VcoNpFevz7G0fZZ2nHokngbUzrEHt8n/nfe7no4H9y6wd8YqrRrO8+0mngMVPWifGcQzUfO7AbyiRrrfnjSTgBcCMy6z3n4zNC9QLHga2fZP/fQz+DPTn03oYMOrjvHV/ftx6v/f9ecluGLh5YVKU5FPWxYYXVY6BevYo68XJ+PVAzdxztO4j4K9FQHAkUC3q3FtOMKTJhIqP55zny94pNNvh4s+11EYtJRIVWgX/vLQp7rqkCbYfTzDjrxkIzySlY+aqg2bh0K4hJmjXMeO5CysRs/S79kAMVu87azKFbTseb2JVreAAfDimG9rVC73gb6Z+eCDe+0cXrNp3Bs/+sMMEf1ancxz588PbolsjN8wW5niOOMxs6JvnrvPAFusscSz58Uc7I7dHurlv65meBKRzSbCWWJmj3IbPe/kAVavn/9xTO62T2pREnwl5gZrb+n2KdbY5x0DN0iP3iT92IfWAsIbW2epsS1BNa6c/s++p1nWZzc2GFwYdRlt75TtikwBrG7LSrD+msORe6PA2x+F+7i23z4sPG5Z0g+vknnOHCyQLP6uAooI0+zw4fm8siXf8x7lDBHmu+Dzny/f2s77PJ/fW9JuwWL8rzjOQGmP9ftnx0YZ9KHi+EqOtQdjm+Abgr4Xn3z/WKPCij3MOsMTO46vXDRjwXN4q7/aynvvbcwM+7ZgHHFieW7oPt96yFiakLlC9fv5zWZHk5FinOI47ZP2bsE3gxJqNtzpa5354eF9eDcmvLwIr3sitscqtqWJN1p2/lnvwVhu1lFpmdg5+33PazHj2846TSMqdUIUahAeakjbbs5PTs7By31ms3ncGm47EITM7/xCrLg3D8NboTma6VGe2eX/+5xEz7SrHmfv7eGPGrV1wWYsIVEoMXvxhZonZcZgag4Qt4JsLAHayS7Oua27TgCwG09yl3ci8qnqOf+ePGgM1q/BtWKXPQMTOcxWlXZ3nx9axkCUsjvM3AdXHIbj6OVS5uxC/N6a3ZQBnAGLNRlJ07u3Jwi88mg8Abv467/HzUdbvlReBvJCihY8Cf0w//3aDagHVG+QuDa23vLhkEh8b7hPPEy8Ky3rO/5wc68Uil+Qz1nkY4o9Zb81y3PqYnTvZ7EDsFHrb/LzPeK0lkHgCuGNpXg0UR4WseTf/tti09Phxp+y22qiddHLkwtqzl+0+jflbjmPJzlMmx/b5MCD3aVYDvZvWND3MS9o7vSTiUjLw8DdbsHjHSQT4euPD27qhT7OaZbY9EZdi8OLFly1wsyaGFxmsEm/QI2+9w2tyc9W3tzZr0J5fgMOrc0v3Z60lfN6yxMlgWJjGlwBjfij6AmDp89YJjWz9EfItbGqpkjuCwpJ30cSmkCtymxPoqzHWi4DrP7CW7m19AFa8XrzzYqvd4TngZzjWKrHmgBchtgswXrCymcSx1ooXbk0vhzOojVpc2p7NcdxcWL3NYM2gzSFeIVX8zGQr1qUm6odXvfBUnSVUPdAf027qjHGz1uOXnadwx8fr8PHt3dG9sRtWg4uUFqv3q9WyLrYmisI06Hnuc837W5fCsIqeVcdxh61LbO79Oh3zr2crzbNDng2DHoM/l+Kqz4sKh0DN7H4sGbPd3xaobcMa2bGPVfWc2IjV9FxC6+Y+rme9Xy2y8HZ71ggUZDpoOtQ8uZCqvqVc5ORYzIVqeQXm80nPysadn6w3PcuD/H3w6R090LlBWKnHfXNSmcU7T2LV3rO4vGUEbu/TyOXHKuJSBTtjJbN0fsbaF8HWJ8HWlGK7b5f7Hrabtxme9/SO762vNeqb11mPzTLchuNFQQWgqm8nnRzx3Or5sR/9idX7zyK4ii9m39GzxJ3X2Pb958FY/LLzpFkOnc3fs3lwuyi8MqKDmTxGRKQgVX2L/E31/P9u64oxH641wfaWD/8wwbp1naInR4lPzcSKPWeweEe0qcrnYxuOKe/brCaaR1TDhysPYMHWaDN8bfo/uqBZRLVil8yX7zmDXScScGO3BggNLJs83QfPJGPPqSRzEcELFTZJ8LZaFV/4uXL4mogUSlXfUmlxgpVb/rfW9ECvEeSPL+7qieaRwfmq63ecSDAZyn776zQ2HI5Dtm22F85AGuSPK1pGoH+rSDPhS1Bu6XnD4ViM+2yDmSWNwXDKyPZm2tOirNp7BlN+3m22YfvsRwe1wMgu9Z2WEOVEfCreWPyXGVLncBj5cAw7AzbHxdsms2lQwz3a6UQ8SYWr+p42bRpeffVVREdHo0OHDnj77bfRvXvuGLcCZs6cibFjx+Z7LiAgAGlp1qkj/46qvsURS8U3f7DGZB3jGO4Zt3TB4ZgU/Lb7NJbvOW3PPmbTpFYQrmwVif6tI03bNjOMFeZ0YjrGz96APw5YO87cc1lTPDSgxTnrrzsYg9d+/stUw5u/ZV9v0/ud+0Ad6oXimWvbomP9AmOfSyA+JRPv/bYPH608gPQs67jgllHBZlhdYlqWWYrqnd+9UTiu71LXDLHjfOsiUskC9Zdffolbb70V06dPR48ePTB16lR8/fXX2L17NyIiIgoN1A888IB5PV/yiUiHiQCKoEAtBcUmZ2D0f9eYiVEKYoez3s1q4tKLapmFk6mUpB37pYW78MEK66xUrBrnOHGWlrccjTMBmiV14vju0d3r497Lm5npTj9edRBTf9ljH5N+Q9f6eGRQC9TIzR9e3Lb4T1YfxLRf99mr6Rl0Hxvc8pwOdAzaSWlZZnsJaZn462QivuOUsXvP2CdNq+LnbWaLYymbQ9vOd5EiIh4WqBmcu3Xrhnfeecc8zsnJMTt/33334bHHHis0UE+YMAFxcbnTKZaQArUUhjOr3fzfP7D7ZKIpbV7awhqYuzYMN+3PpfHD5uN49NstSMnINmPHW9UONkPEiMFuVNd6GH9F83MmejmVmGYCPQMmhVTxxYMDWvztVKisnv9uw1FTzX083lrTdFFkNTw6qKWpqi9Jb3RWlzPhybfrj2Lf6bzELFEhVXB/v+bm4kK920U8OFBnZGQgMDAQ33zzDYYNG2Z/fsyYMSYQf/89u+KfG6jvuOMO1K1b1wT1zp0748UXX0SbNm2KtU0Failq6FZyenaZ5N1mCfWfn67HgdwsZCyMDutY1wS7RjWLnpJx/aEYPPn9djNlqy3osoMaE49kZFuQkZWdez/H3HIGNuYhp9qhVcw859d1rleqEjB/JrYcjTdzvHPKWG6DBrWJwkvXtzPj1EXEA3t9nzlzBtnZ2edUW/Pxrl27Cn1PixYt8OGHH6J9+/aIj4/HlClT0Lt3b2zfvr3Qg01PTzeLTWJiMZMZSKUT4OtjlrJwUWQwvh/fBy/M34nMnByMu6wpmkUUmBP6PLo0DMe88X1NGk92OGNvci5FYemb1ehjejcyvdxLi6XmDvWrm+XxIa3wyapDeGXRLvy0PRqbj8Zh6g0d0aNJ7hzJIuJUFW6QZ69evcxiwyDdqlUrvP/++3juOYfJ5nNNnjwZzzzzTDnvpci5OAzq5REOSTJKgKXhf/RsiCHtamPBthPIyraYKnm2bZvbAvdZfV9WHb94MXPnJU3MtK/3fb7R1BKwjX/85c1MDYFLM5SJeCCXBuqaNWvCx8cHJ0+ezPc8H0dFFS/FoZ+fHzp16oS9e/cW+vqkSZMwceJE++Njx46hdevWpdxzEddgRzPm2HYHbeuGYv59ffH0vO34ev1RvLV0r0m68uaNHVEvLLBY1elq3xZx80Dt7++PLl26YMmSJfY2arY78/H48eOL9RmsOt+6dSsGDx5c6OscusXFJiHhPJPKi0iJcez4qyM74OKLauHx77Zi/aFYXPXm75h8XTtc3b6Ome/94JkUU+o+cCYJ+81tspl0JTYl04wzDwrwMbfVOPEKb81zvggL9DPTsfZsUkM9zKVSc3nVN0u77DzWtWtXM3aaw7OSk5PtY6U5dIsdx1iFTc8++yx69uyJZs2amQ5nHH996NAh08FMRFzjmg510Kl+ddz/xUZsPByH8bM34ul5O0xv+qJwOBiXkyh8PQ5tYw/zazvVwfBOddEyqujZ40Q8kcsD9Q033IDTp0/jySefNBOedOzYET/99JO9g9nhw4fhzUwwuWJjY3HnnXeadcPCwkyJfNWqVarOFnExjjH/6p+98OYvezBt2V57kGbJuHHNIDSuWc1MGMP7jWoEoWawP1LSs+3B2jaOOzE9y+QuZ6l74bZoM8Pb+7/tN0ur2iEY3qkOru1Yt0zTohbHsbhULN150uxvoJ8PAv19UdWftz65t77mfkRwgNv2imfzw5GYVHMsnRpUd0rHQ3E+l4+jLm8aniVS9g6dTTazujWpGWTa1UszZO7XXacxZ+NRLN11CpnZ1p8rjjRjqlQG/WyLxUz3mmOxIDvHGnzMcxYgMjjAVJ8znakz5jE/fDYFC7edwIJt0dh8pHhzOXDYOmsb+rWKRL9WEWgRGeyytnkO39t+PN40Uaw7GIv1h2PNLHrUtFYQpozsgE6lzCYnHjaO2hUUqEUqpriUDPy49QTmbDiGdYdiS/ReJh3hBDacl/2yFrVKVMLdfzrJlOwZoDnVrA1jbbeG4WYu9NSMbNMez0ltOB2ruc19jm3xjjixDSeeuaJVBHo1qVHmpVheUHAY3fqDsWYonW0aWRs/Hy+zD5xKlhdAd1/aFA/0b15mQxXLYmbBDYdjzcUH56kf27uxqdFwdwrUTjo5IuKeWLJdtD3aVDv7eHmZAMPkJd5eXmDBmbcste48kYBfd53C2eSMfEPdujQMQ/9WEejaKNwE1NiUDBNQ45Jzb1MyEJOSgaOxqSbXuA23w85tV7WrjYFtIhER/PfV78fjUvHr7lNYuvOUmZLVMVAyCUrf5jVxa6+GZopZZ5W0Wf5a9tdpTF+2zz7fvA2bInj8HJ/ftVEY2tUNNdPNsvf+3E3HzTos9b82qoPp2e9OLBaL6ZDIi451h2JMcHacMY94PDNu7YLaofln+nM3CtROOjkiUvFxSlVmSFuy8ySW7DxlpoktCV9vLzPf+1Vto0xGsZLMt14QLwpW7TuDJbusgZvt7zZt64aY0uxVbWtfcC93Vm1zytoZy/fbj5P7zwuLi5vVRJdGYaY54nwXBD9ti8bjc7aaCxvuAyfN4fj40k6jWxJsxuDMesfiUsyF0tFYaxs6b7cejTunhsJWbc8qezaPxCRnmOxv79/S2VyMuCsFaiedHBHxPEdiUqxBe9cpM7UrJ6IJC/RH9cDc2yDrLUue4UEB6NYorEw6g7F0yDSqX687ii/+PIy0TGtJu2GNQNx1SROT/KS41eLshPfF2sP434oDOJE7vzsTytzUowFu79u4RKXLs0npZspaNjNQ69ohpu367/K1XygmjJm/5TgWbT9p+jaciEsz0+GeT4CvNzrUq24uOro0CEPnhmH2aX/53d75yTqTYIcTAD0/vC1Gda1f7O+DWeuYRY8dAcuaArWTTo6ISHlgKZAZ0z5efdA+jzpLhWP7NDIz0oVW9TPZ2LgeS5tM2MJOYKcS0nE8PhU/bjmBhLQs+/tu79vITIzD910oBs8n5m4zJVi2Y196UQTqhVVFnepVUKc6b6ua9vZa1QJKnDOdtRxsBmBudDZhsCbAEUvzHJZXN6wq6lWvarbL+5yKt02d0CJL+Bwx8OBXm027PN3epzH+PbjleWfMY7Y49nuY9cchMzUvx/Jf36WeOe+cU7+sKFA76eSIiJQndj77Yu0RfPD7fnvmMw7xYgkvJjnd9GQ/H1ZpsyQ+rFNdp3VQ48UAq8J/3pF/9khHrFqPCq1igimH3TWsweF3geaWtQOcvMaG7f1M7MLsbicT8sbOs02cOc871g8zAZm99UszFW1OjgVvLd1jUsXSxc1r4p3RnREamHfhwlSzs9YcNklmbPnY2SLg2L26d9MauKVnQ5N/3hmjBhwpUDvp5IiIuALzg7Otefpv+/IlYGHBlW3kHJvNhdW07NDGZCn9WkaUuGRb3CrhPw/GmjZvdozLW9JMGztLx0XhPjYMDzTHtPlovP15NjUwg9yILvXQpk5ImQxZW7j1BCZ+tdkEYl48MB/8juMJmPXHYWw9lrcvzEjHEvS1HeqanvGfrjlkmkdshxYZEoCbujc0aV0jnDR+X4HaSSdHRMSVWDLcdjze9GKPCAlAjaAAt5pOldXxp5PScSw2FUdiU8x0sWznPXg2GYfOppiqekfc98tb1DLBmePby2MI2I7jCabdmh3SHLENe3C7KBOg2Qu+4IXC0dgUk7GONRy2UQOsPRjYNgpPDGltahFKQ4HaSSdHRERK11GMQ+kYuNnjncGZJezydjYpHffM2oC1B2JMdfzNPRpgRJf6xco9z0l32Bv+09WHzPh9zkX/x7/75avS9+h81CIi4rnYma1dvVCzuFKNagH4/M6eJjFMk5rVStREwFI/p6zlwtL53tNJpQ7SJaVALSIiHs/H2wvNIoJL9RkcolZWw9SKogzvIiIibkyBWkRExI0pUIuIiLgxBWoRERE3pkAtIiLixipdr++cHOucsidOWCecFxERKW+2GGSLSUWpdIH65EnrnLXdu3d39a6IiEgld/LkSTRo0KDIdbwsnMi1EsnKysLGjRsRGRkJb+/S1fwnJiaidevW2LFjB4KDSzc+T6Qi0d++VEaJTvzNZ0maQbpTp07w9S26zFzpArUzJSQkIDQ0FPHx8QgJKf9B8CKuor99qYwSXPSbr85kIiIibkyBWkRExI0pUJdCQEAAnnrqKXMrUpnob18qowAX/earjVpERMSNqUQtIiLixhSoRURE3JgCtYiIiBtToC6FadOmoVGjRqhSpQp69OiBtWvXOu+bEXFDy5cvx9ChQ1GnTh14eXlh7ty5rt4lkTI3efJkdOvWzUxyEhERgWHDhmH37t0oLwrUF+jLL7/ExIkTTQ/ADRs2oEOHDhg4cCBOnTrl3G9IxI0kJyebv3VepIpUFr/99hvuvfderFmzBosXL0ZmZiYGDBhg/j+UB/X6vkAsQfMK65133rFPB1e/fn3cd999eOyxx5z5HYm4JZao58yZY0oXIpXJ6dOnTcmaAfySSy4p8+2pRH0BMjIysH79evTv3z/vRHp7m8erV6925vcjIiJuhlOIUnh4eLlsT4H6Apw5cwbZ2dkmsYcjPo6OjnbWdyMiIm6GtacTJkxAnz590LZt23LZZqVLcykiInKh2Fa9bds2rFixAuVFgfoC1KxZEz4+Pvbc1jZ8HBUV5azvRkRE3Mj48eMxf/58M/qhXr165bZdVX1fAH9/f3Tp0gVLlizJVx3Cx7169XLm9yMiIi7GbNAM0uw8uXTpUjRu3Lhct68S9QXi0KwxY8aga9eu6N69O6ZOnWq66o8dO9a535CIG0lKSsLevXvtjw8cOIBNmzaZTjUNGjRw6b6JlGV19+zZs/H999+bsdS2vkjMTV21alWUNQ3PKgUOzXr11VfNl9axY0e89dZbZtiWiKdatmwZLr/88nOe50XrzJkzXbJPIuUxFLEwH330EW677bay376FZXoRERFxS2qjFhERcWMK1CIiIm5MgVpERMSNKVCLiIi4MQVqERERN6ZALSIi4sYUqEVERNyYArWIiIgbU6AWkTKd0Wnu3Lk6wyKloEAt4qE4tSEDZcFl0KBBrt41ESkBJeUQ8WAMypyP2FFAQIDL9kdESk4lahEPxqDMHOmOS1hYmHmNpev33nsPV111lckA1KRJE3zzzTf53r9161ZcccUV5vUaNWrgrrvuMhm0HH344Ydo06aN2Vbt2rVNOkBHZ86cwfDhwxEYGIjmzZtj3rx59tdiY2Nx8803o1atWmYbfL3ghYVIZadALVKJPfHEE7j++uuxefNmEzBvvPFG7Ny507zGtK0DBw40gf3PP//E119/jV9++SVfIGagZwpABnAGdQbhZs2a5dvGM888g1GjRmHLli0YPHiw2U5MTIx9+zt27MDChQvNdvl5NWvWLOezIOLmmD1LRDzPmDFjLD4+PpagoKB8ywsvvGBe53//u+++O997evToYbnnnnvM/RkzZljCwsIsSUlJ9td//PFHi7e3tyU6Oto8rlOnjuXxxx8/7z5wG//5z3/sj/lZfG7hwoXm8dChQy1jx4518pGLeBa1UYt4MOaOZinVUXh4uP1+r1698r3Gx5s2bTL3WcLt0KEDgoKC7K/36dMHOTk52L17t6k6P378OPr161fkPrRv395+n58VEhKCU6dOmcf33HOPKdFv2LABAwYMwLBhw9C7d+9SHrWIZ1GgFvFgDIwFq6KdhW3KxeHn55fvMQM8gz2xffzQoUNYsGABFi9ebII+q9KnTJlSJvssUhGpjVqkEluzZs05j1u1amXu85Zt12yrtlm5ciW8vb3RokULBAcHo1GjRliyZEmp9oEdycaMGYPPPvsMU6dOxYwZM0r1eSKeRiVqEQ+Wnp6O6OjofM/5+vraO2yxg1jXrl3Rt29fzJo1C2vXrsX//vc/8xo7fT311FMmiD799NM4ffo07rvvPtxyyy2IjIw06/D5u+++GxEREaZ0nJiYaII51yuOJ598El26dDG9xrmv8+fPt18oiIiVArWIB/vpp5/MkClHLA3v2rXL3iP7iy++wLhx48x6n3/+OVq3bm1e43CqRYsW4YEHHkC3bt3MY7Ynv/766/bPYhBPS0vDG2+8gYceeshcAIwYMaLY++fv749Jkybh4MGDpir94osvNvsjInm82KPM4bGIVBJsK54zZ47pwCUi7ktt1CIiIm5MgVpERMSNqY1apJJSq5dIxaAStYiIiBtToBYREXFjCtQiIiJuTIFaRETEjSlQi4iIuDEFahERETemQC0iIuLGFKhFRETcmAK1iIgI3Nf/AxKbbl0A2sA/AAAAAElFTkSuQmCC",
      "text/plain": [
       "<Figure size 500x300 with 2 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from previous_chapters import plot_losses\n",
    "# Alternatively:\n",
    "# from llms_from_scratch.ch05 import plot_losses\n",
    "\n",
    "epochs_tensor = torch.linspace(0, num_epochs, len(train_losses))\n",
    "plot_losses(epochs_tensor, tokens_seen, train_losses, val_losses)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6777e0c4-d82c-46d8-84fb-1376c4f8bae0",
   "metadata": {
    "id": "6777e0c4-d82c-46d8-84fb-1376c4f8bae0"
   },
   "source": [
    "- As we can see, the loss decreases sharply at the beginning of the first epoch, which means the model starts learning quickly\n",
    "- We can see that slight overfitting sets in at around 1 training epoch"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "87b79a47-13f9-4d1f-87b1-3339bafaf2a3",
   "metadata": {
    "id": "87b79a47-13f9-4d1f-87b1-3339bafaf2a3"
   },
   "source": [
    "## 7.7 Extracting and saving responses"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5a25cc88-1758-4dd0-b8bf-c044cbf2dd49",
   "metadata": {
    "id": "5a25cc88-1758-4dd0-b8bf-c044cbf2dd49"
   },
   "source": [
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/18.webp?1\" width=500px>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "17510e9d-7727-4d58-ba9a-d82ec23c1427",
   "metadata": {
    "id": "17510e9d-7727-4d58-ba9a-d82ec23c1427"
   },
   "source": [
    "- In this section, we save the test set responses for scoring in the next section\n",
    "- We also save a copy of the model for future use\n",
    "- But first, let's take a brief look at the responses generated by the finetuned model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "id": "VQ2NZMbfucAc",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "VQ2NZMbfucAc",
    "outputId": "066c56ff-b52a-4ee6-eae7-1bddfc74d0c1"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n",
      "\n",
      "### Instruction:\n",
      "Rewrite the sentence using a simile.\n",
      "\n",
      "### Input:\n",
      "The car is very fast.\n",
      "\n",
      "Correct response:\n",
      ">> The car is as fast as lightning.\n",
      "\n",
      "Model response:\n",
      ">> The car is as fast as a bullet.\n",
      "-------------------------------------\n",
      "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n",
      "\n",
      "### Instruction:\n",
      "What type of cloud is typically associated with thunderstorms?\n",
      "\n",
      "Correct response:\n",
      ">> The type of cloud typically associated with thunderstorms is cumulonimbus.\n",
      "\n",
      "Model response:\n",
      ">> The type of cloud associated with thunderstorms is a cumulus cloud.\n",
      "-------------------------------------\n",
      "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n",
      "\n",
      "### Instruction:\n",
      "Name the author of 'Pride and Prejudice'.\n",
      "\n",
      "Correct response:\n",
      ">> Jane Austen.\n",
      "\n",
      "Model response:\n",
      ">> The author of 'Pride and Prejudice' is Jane Austen.\n",
      "-------------------------------------\n"
     ]
    }
   ],
   "source": [
    "torch.manual_seed(123)\n",
    "\n",
    "\n",
    "for entry in test_data[:3]:\n",
    "\n",
    "    input_text = format_input(entry)\n",
    "\n",
    "    token_ids = generate(\n",
    "        model=model,\n",
    "        idx=text_to_token_ids(input_text, tokenizer).to(device),\n",
    "        max_new_tokens=256,\n",
    "        context_size=BASE_CONFIG[\"context_length\"],\n",
    "        eos_id=50256\n",
    "    )\n",
    "    generated_text = token_ids_to_text(token_ids, tokenizer)\n",
    "    response_text = (\n",
    "        generated_text[len(input_text):]\n",
    "        .replace(\"### Response:\", \"\")\n",
    "        .strip()\n",
    ")\n",
    "\n",
    "    print(input_text)\n",
    "    print(f\"\\nCorrect response:\\n>> {entry['output']}\")\n",
    "    print(f\"\\nModel response:\\n>> {response_text.strip()}\")\n",
    "    print(\"-------------------------------------\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "49ab64c1-586f-4939-8def-23feeb1b3599",
   "metadata": {
    "id": "49ab64c1-586f-4939-8def-23feeb1b3599"
   },
   "source": [
    "- As we can see based on the test set instructions, given responses, and the model's responses, the model performs relatively well\n",
    "- The answers to the first and last instructions are clearly correct\n",
    "- The second answer is close; the model answers with \"cumulus cloud\" instead of \"cumulonimbus\" (however, note that cumulus clouds can develop into cumulonimbus clouds, which are capable of producing thunderstorms)\n",
    "- Most importantly, we can see that model evaluation is not as straightforward as in the previous chapter, where we just had to calculate the percentage of correct spam/non-spam class labels to obtain the classification accuracy\n",
    "- In practice, instruction-finetuned LLMs such as chatbots are evaluated via multiple approaches\n",
    "  - short-answer and multiple choice benchmarks such as MMLU (\"Measuring Massive Multitask Language Understanding\", [https://arxiv.org/abs/2009.03300](https://arxiv.org/abs/2009.03300)), which test the knowledge of a model\n",
    "  - human preference comparison to other LLMs, such as LMSYS chatbot arena ([https://arena.lmsys.org](https://arena.lmsys.org))\n",
    "  - automated conversational benchmarks, where another LLM like GPT-4 is used to evaluate the responses, such as AlpacaEval ([https://tatsu-lab.github.io/alpaca_eval/](https://tatsu-lab.github.io/alpaca_eval/))\n",
    "\n",
    "- In the next section, we will use an approach similar to AlpacaEval and use another LLM to evaluate the responses of our model; however, we will use our own test set instead of using a publicly available benchmark dataset\n",
    "- For this, we add the model response to the `test_data` dictionary and save it as a `\"instruction-data-with-response.json\"` file for record-keeping so that we can load and analyze it in separate Python sessions if needed"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "-PNGKzY4snKP",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "-PNGKzY4snKP",
    "outputId": "37b22a62-9860-40b7-c46f-b297782b944c"
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|█████████████████████████████████████████| 110/110 [01:08<00:00,  1.60it/s]\n"
     ]
    }
   ],
   "source": [
    "from tqdm import tqdm\n",
    "\n",
    "for i, entry in tqdm(enumerate(test_data), total=len(test_data)):\n",
    "\n",
    "    input_text = format_input(entry)\n",
    "\n",
    "    token_ids = generate(\n",
    "        model=model,\n",
    "        idx=text_to_token_ids(input_text, tokenizer).to(device),\n",
    "        max_new_tokens=256,\n",
    "        context_size=BASE_CONFIG[\"context_length\"],\n",
    "        eos_id=50256\n",
    "    )\n",
    "    generated_text = token_ids_to_text(token_ids, tokenizer)\n",
    "    response_text = generated_text[len(input_text):].replace(\"### Response:\", \"\").strip()\n",
    "\n",
    "    test_data[i][\"model_response\"] = response_text\n",
    "\n",
    "\n",
    "with open(\"instruction-data-with-response.json\", \"w\") as file:\n",
    "    json.dump(test_data, file, indent=4)  # \"indent\" for pretty-printing"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "228d6fa7-d162-44c3-bef1-4013c027b155",
   "metadata": {
    "id": "228d6fa7-d162-44c3-bef1-4013c027b155"
   },
   "source": [
    "- Let's double-check one of the entries to see whether the responses have been added to the `test_data` dictionary correctly"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "u-AvCCMTnPSE",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "u-AvCCMTnPSE",
    "outputId": "7bcd9600-1446-4829-b773-5259b13d256a"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'instruction': 'Rewrite the sentence using a simile.', 'input': 'The car is very fast.', 'output': 'The car is as fast as lightning.', 'model_response': 'The car is as fast as a bullet.'}\n"
     ]
    }
   ],
   "source": [
    "print(test_data[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c1b2f3f6-8569-405a-9db6-d47cba65608a",
   "metadata": {
    "id": "c1b2f3f6-8569-405a-9db6-d47cba65608a"
   },
   "source": [
    "- Finally, we also save the model in case we want to reuse it in the future"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "8cBU0iHmVfOI",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "8cBU0iHmVfOI",
    "outputId": "135849ed-9acd-43a2-f438-053d07dae9b2",
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Model saved as gpt2-medium355M-sft.pth\n"
     ]
    }
   ],
   "source": [
    "import re\n",
    "\n",
    "\n",
    "file_name = f\"{re.sub(r'[ ()]', '', CHOOSE_MODEL) }-sft.pth\"\n",
    "torch.save(model.state_dict(), file_name)\n",
    "print(f\"Model saved as {file_name}\")\n",
    "\n",
    "# Load model via\n",
    "# model.load_state_dict(torch.load(\"gpt2-medium355M-sft.pth\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "obgoGI89dgPm",
   "metadata": {
    "id": "obgoGI89dgPm"
   },
   "source": [
    "## 7.8 Evaluating the finetuned LLM"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "805b9d30-7336-499f-abb5-4a21be3129f5",
   "metadata": {
    "id": "805b9d30-7336-499f-abb5-4a21be3129f5"
   },
   "source": [
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/19.webp?1\" width=500px>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "68d2b9d3-b6ff-4533-a89d-7b66079b4fd1",
   "metadata": {
    "id": "68d2b9d3-b6ff-4533-a89d-7b66079b4fd1"
   },
   "source": [
    "- In this section, we automate the response evaluation of the finetuned LLM using another, larger LLM\n",
    "- In particular, we use an instruction-finetuned 8-billion-parameter Llama 3 model by Meta AI that can be run locally via ollama ([https://ollama.com](https://ollama.com))\n",
    "- (Alternatively, if you prefer using a more capable LLM like GPT-4 via the OpenAI API, please see the [llm-instruction-eval-openai.ipynb](../03_model-evaluation/llm-instruction-eval-openai.ipynb) notebook)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ea427a30-36ba-44e3-bb1f-eb0d7008d6e9",
   "metadata": {
    "id": "ea427a30-36ba-44e3-bb1f-eb0d7008d6e9"
   },
   "source": [
    "- Ollama is an application to run LLMs efficiently\n",
    "- It is a wrapper around llama.cpp ([https://github.com/ggerganov/llama.cpp](https://github.com/ggerganov/llama.cpp)), which implements LLMs in pure C/C++ to maximize efficiency\n",
    "- Note that it is a tool for using LLMs to generate text (inference), not training or finetuning LLMs\n",
    "- Before running the code below, install ollama by visiting [https://ollama.com](https://ollama.com) and following the instructions (for instance, clicking on the \"Download\" button and downloading the ollama application for your operating system)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "267cd444-3156-46ad-8243-f9e7a55e66e7",
   "metadata": {},
   "source": [
    "- For macOS and Windows users, click on the ollama application you downloaded; if it prompts you to install the command line usage, say \"yes\"\n",
    "- Linux users can use the installation command provided on the ollama website\n",
    "\n",
    "- In general, before we can use ollama from the command line, we have to either start the ollama application or run `ollama serve` in a separate terminal\n",
    "\n",
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/20.webp?1\" width=700px>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "30266e32-63c4-4f6c-8be3-c99e05ed05b7",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "**Note**:\n",
    "\n",
    "- When running `ollama serve` in the terminal, as described above, you may encounter an error message saying `Error: listen tcp 127.0.0.1:11434: bind: address already in use`\n",
    "- If that's the case, try use the command `OLLAMA_HOST=127.0.0.1:11435 ollama serve` (and if this address is also in use, try to increment the numbers by one until you find an address not in use\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "747a2fc7-282d-47ec-a987-ed0a23ed6822",
   "metadata": {
    "id": "747a2fc7-282d-47ec-a987-ed0a23ed6822"
   },
   "source": [
    "- With the ollama application or `ollama serve` running in a different terminal, on the command line, execute the following command to try out the 8-billion-parameter Llama 3 model (the model, which takes up 4.7 GB of storage space, will be automatically downloaded the first time you execute this command)\n",
    "\n",
    "```bash\n",
    "# 8B model\n",
    "ollama run llama3\n",
    "```\n",
    "\n",
    "\n",
    "The output looks like as follows\n",
    "\n",
    "```\n",
    "$ ollama run llama3\n",
    "pulling manifest\n",
    "pulling 6a0746a1ec1a... 100% ▕████████████████▏ 4.7 GB\n",
    "pulling 4fa551d4f938... 100% ▕████████████████▏  12 KB\n",
    "pulling 8ab4849b038c... 100% ▕████████████████▏  254 B\n",
    "pulling 577073ffcc6c... 100% ▕████████████████▏  110 B\n",
    "pulling 3f8eb4da87fa... 100% ▕████████████████▏  485 B\n",
    "verifying sha256 digest\n",
    "writing manifest\n",
    "removing any unused layers\n",
    "success\n",
    "```\n",
    "\n",
    "- Note that `llama3` refers to the instruction finetuned 8-billion-parameter Llama 3 model\n",
    "\n",
    "- Using ollama with the `\"llama3\"` model (a 8B parameter model) requires 16 GB of RAM; if this is not supported by your machine, you can try the smaller model, such as the 3.8B parameter phi-3 model by setting `model = \"phi-3\"`, which only requires 8 GB of RAM\n",
    "\n",
    "- Alternatively, you can also use the larger 70-billion-parameter Llama 3 model, if your machine supports it, by replacing `llama3` with `llama3:70b`\n",
    "\n",
    "- After the download has been completed, you will see a command line prompt that allows you to chat with the model\n",
    "\n",
    "- Try a prompt like \"What do llamas eat?\", which should return an output similar to the following\n",
    "\n",
    "```\n",
    ">>> What do llamas eat?\n",
    "Llamas are ruminant animals, which means they have a four-chambered\n",
    "stomach and eat plants that are high in fiber. In the wild, llamas\n",
    "typically feed on:\n",
    "1. Grasses: They love to graze on various types of grasses, including tall\n",
    "grasses, wheat, oats, and barley.\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7b7b341c-ba0e-40bb-a52c-cb328bbd1fe4",
   "metadata": {
    "id": "7b7b341c-ba0e-40bb-a52c-cb328bbd1fe4"
   },
   "source": [
    "- You can end this session using the input `/bye`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "faaf3e02-8ca0-4edf-be23-60625a5b14e3",
   "metadata": {
    "id": "faaf3e02-8ca0-4edf-be23-60625a5b14e3"
   },
   "source": [
    "- The following code checks whether the ollama session is running correctly before proceeding to use ollama to evaluate the test set responses we generated in the previous section"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "id": "026e8570-071e-48a2-aa38-64d7be35f288",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 193
    },
    "id": "026e8570-071e-48a2-aa38-64d7be35f288",
    "outputId": "e30d3533-e1f5-4aa9-b24f-33273fc7b30e"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Ollama running: True\n"
     ]
    }
   ],
   "source": [
    "import psutil\n",
    "\n",
    "def check_if_running(process_name):\n",
    "    running = False\n",
    "    for proc in psutil.process_iter([\"name\"]):\n",
    "        if process_name in proc.info[\"name\"]:\n",
    "            running = True\n",
    "            break\n",
    "    return running\n",
    "\n",
    "ollama_running = check_if_running(\"ollama\")\n",
    "\n",
    "if not ollama_running:\n",
    "    raise RuntimeError(\"Ollama not running. Launch ollama before proceeding.\")\n",
    "print(\"Ollama running:\", check_if_running(\"ollama\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "id": "723c9b00-e3cd-4092-83c3-6e48b5cf65b0",
   "metadata": {
    "id": "723c9b00-e3cd-4092-83c3-6e48b5cf65b0"
   },
   "outputs": [],
   "source": [
    "# This cell is optional; it allows you to restart the notebook\n",
    "# and only run section 7.7 without rerunning any of the previous code\n",
    "import json\n",
    "from tqdm import tqdm\n",
    "\n",
    "file_path = \"instruction-data-with-response.json\"\n",
    "\n",
    "with open(file_path, \"r\") as file:\n",
    "    test_data = json.load(file)\n",
    "\n",
    "\n",
    "def format_input(entry):\n",
    "    instruction_text = (\n",
    "        f\"Below is an instruction that describes a task. \"\n",
    "        f\"Write a response that appropriately completes the request.\"\n",
    "        f\"\\n\\n### Instruction:\\n{entry['instruction']}\"\n",
    "    )\n",
    "\n",
    "    input_text = f\"\\n\\n### Input:\\n{entry['input']}\" if entry[\"input\"] else \"\"\n",
    "\n",
    "    return instruction_text + input_text"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b3464705-d026-4594-977f-fb357e51c3a9",
   "metadata": {
    "id": "b3464705-d026-4594-977f-fb357e51c3a9"
   },
   "source": [
    "- Now, an alternative way to the `ollama run` command we used earlier to interact with the model is via its REST API in Python via the following function\n",
    "- Before you run the next cells in this notebook, make sure that ollama is still running (the previous code cells should print `\"Ollama running: True\"`)\n",
    "- Next, run the following code cell to query the model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "id": "e3ae0e10-2b28-42ce-8ea2-d9366a58088f",
   "metadata": {
    "id": "e3ae0e10-2b28-42ce-8ea2-d9366a58088f",
    "outputId": "cc43acb3-8216-43cf-c77d-71d4089dc96c"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Llamas are herbivores, which means they primarily feed on plant-based foods. Their diet typically consists of:\n",
      "\n",
      "1. Grasses: Llamas love to graze on various types of grasses, including tall grasses, short grasses, and even weeds.\n",
      "2. Hay: High-quality hay, such as alfalfa or timothy hay, is a staple in a llama's diet. They enjoy the sweet taste and texture of fresh hay.\n",
      "3. Grains: Llamas may receive grains like oats, barley, or corn as part of their daily ration. However, it's essential to provide these grains in moderation, as they can be high in calories.\n",
      "4. Fruits and vegetables: Llamas enjoy a variety of fruits and veggies, such as apples, carrots, sweet potatoes, and leafy greens like kale or spinach.\n",
      "5. Minerals: Llamas require access to mineral supplements, which help maintain their overall health and well-being.\n",
      "\n",
      "In the wild, llamas might also eat:\n",
      "\n",
      "1. Leaves: They'll munch on leaves from trees and shrubs, including plants like willow, alder, and birch.\n",
      "2. Bark: In some cases, llamas may eat the bark of certain trees, like aspen or cottonwood.\n",
      "3. Mosses and lichens: These non-vascular plants can be a tasty snack for llamas.\n",
      "\n",
      "In captivity, llama owners typically provide a balanced diet that includes a mix of hay, grains, and fruits/vegetables. It's essential to consult with a veterinarian or experienced llama breeder to determine the best feeding plan for your llama.\n"
     ]
    }
   ],
   "source": [
    "import requests  # noqa: F811\n",
    "# import urllib.request\n",
    "\n",
    "def query_model(\n",
    "    prompt,\n",
    "    model=\"llama3\",\n",
    "    # If you used OLLAMA_HOST=127.0.0.1:11435 ollama serve\n",
    "    # update the address from 11434 to 11435\n",
    "    url=\"http://localhost:11434/api/chat\"\n",
    "):\n",
    "    # Create the data payload as a dictionary\n",
    "    data = {\n",
    "        \"model\": model,\n",
    "        \"messages\": [\n",
    "            {\"role\": \"user\", \"content\": prompt}\n",
    "        ],\n",
    "        \"options\": {     # Settings below are required for deterministic responses\n",
    "            \"seed\": 123,\n",
    "            \"temperature\": 0,\n",
    "            \"num_ctx\": 2048\n",
    "        }\n",
    "    }\n",
    "\n",
    "    \n",
    "    \"\"\"\n",
    "    # Convert the dictionary to a JSON formatted string and encode it to bytes\n",
    "    payload = json.dumps(data).encode(\"utf-8\")\n",
    "\n",
    "    # Create a request object, setting the method to POST and adding necessary headers\n",
    "    request = urllib.request.Request(\n",
    "        url,\n",
    "        data=payload,\n",
    "        method=\"POST\"\n",
    "    )\n",
    "    request.add_header(\"Content-Type\", \"application/json\")\n",
    "\n",
    "    # Send the request and capture the response\n",
    "    response_data = \"\"\n",
    "    with urllib.request.urlopen(request) as response:\n",
    "        # Read and decode the response\n",
    "        while True:\n",
    "            line = response.readline().decode(\"utf-8\")\n",
    "            if not line:\n",
    "                break\n",
    "            response_json = json.loads(line)\n",
    "            response_data += response_json[\"message\"][\"content\"]\n",
    "\n",
    "    return response_data\n",
    "    \"\"\"\n",
    "\n",
    "    # The book originally used the commented-out above, which is based\n",
    "    # on urllib. It works generally fine, but some readers reported\n",
    "    # issues with using urlib when using a (company) VPN.\n",
    "    # The code below uses the requests library, which doesn't seem\n",
    "    # to have these issues.\n",
    "\n",
    "    # Send the POST request\n",
    "    with requests.post(url, json=data, stream=True, timeout=30) as r:\n",
    "        r.raise_for_status()\n",
    "        response_data = \"\"\n",
    "        for line in r.iter_lines(decode_unicode=True):\n",
    "            if not line:\n",
    "                continue\n",
    "            response_json = json.loads(line)\n",
    "            if \"message\" in response_json:\n",
    "                response_data += response_json[\"message\"][\"content\"]\n",
    "\n",
    "    return response_data\n",
    "\n",
    "\n",
    "model = \"llama3\"\n",
    "result = query_model(\"What do Llamas eat?\", model)\n",
    "print(result)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fb6ec392-6d03-4a65-951c-39b92f8df2d8",
   "metadata": {},
   "source": [
    "- Note that if you are getting an `HTTPError: 404 Client Error: Not Found for url: http://localhost:11434/api/chat` error, this could mean you haven't downloaded the `llama3` model yet (to download the model, either use the UI or `ollama run llama3` on the terminal)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "207ae28f-0f8c-4fda-aeef-e7e3046249cc",
   "metadata": {
    "id": "207ae28f-0f8c-4fda-aeef-e7e3046249cc"
   },
   "source": [
    "- Now, using the `query_model` function we defined above, we can evaluate the responses of our finetuned model; let's try it out on the first 3 test set responses we looked at in a previous section"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "id": "86b839d4-064d-4178-b2d7-01691b452e5e",
   "metadata": {
    "id": "86b839d4-064d-4178-b2d7-01691b452e5e",
    "outputId": "1c755ee1-bded-4450-9b84-1466724f389a"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Dataset response:\n",
      ">> The car is as fast as lightning.\n",
      "\n",
      "Model response:\n",
      ">> The car is as fast as a bullet.\n",
      "\n",
      "Score:\n",
      ">> I'd rate the model response \"The car is as fast as a bullet.\" an 85 out of 100.\n",
      "\n",
      "Here's why:\n",
      "\n",
      "* The response uses a simile correctly, comparing the speed of the car to something else (in this case, a bullet).\n",
      "* The comparison is relevant and makes sense, as bullets are known for their high velocity.\n",
      "* The phrase \"as fast as\" is used correctly to introduce the simile.\n",
      "\n",
      "The only reason I wouldn't give it a perfect score is that some people might find the comparison slightly less vivid or evocative than others. For example, comparing something to lightning (as in the original response) can be more dramatic and attention-grabbing. However, \"as fast as a bullet\" is still a strong and effective simile that effectively conveys the idea of the car's speed.\n",
      "\n",
      "Overall, I think the model did a great job!\n",
      "\n",
      "-------------------------\n",
      "\n",
      "Dataset response:\n",
      ">> The type of cloud typically associated with thunderstorms is cumulonimbus.\n",
      "\n",
      "Model response:\n",
      ">> The type of cloud associated with thunderstorms is a cumulus cloud.\n",
      "\n",
      "Score:\n",
      ">> I'd score this model response as 40 out of 100.\n",
      "\n",
      "Here's why:\n",
      "\n",
      "* The model correctly identifies that thunderstorms are related to clouds (correctly identifying the type of phenomenon).\n",
      "* However, it incorrectly specifies the type of cloud associated with thunderstorms. Cumulus clouds are not typically associated with thunderstorms; cumulonimbus clouds are.\n",
      "* The response lacks precision and accuracy in its description.\n",
      "\n",
      "Overall, while the model attempts to address the instruction, it provides an incorrect answer, which is a significant error.\n",
      "\n",
      "-------------------------\n",
      "\n",
      "Dataset response:\n",
      ">> Jane Austen.\n",
      "\n",
      "Model response:\n",
      ">> The author of 'Pride and Prejudice' is Jane Austen.\n",
      "\n",
      "Score:\n",
      ">> I'd rate my own response as 95 out of 100. Here's why:\n",
      "\n",
      "* The response accurately answers the question by naming the author of 'Pride and Prejudice' as Jane Austen.\n",
      "* The response is concise and clear, making it easy to understand.\n",
      "* There are no grammatical errors or ambiguities that could lead to confusion.\n",
      "\n",
      "The only reason I wouldn't give myself a perfect score is that the response is slightly redundant - it's not necessary to rephrase the question in the answer. A more concise response would be simply \"Jane Austen.\"\n",
      "\n",
      "-------------------------\n"
     ]
    }
   ],
   "source": [
    "for entry in test_data[:3]:\n",
    "    prompt = (\n",
    "        f\"Given the input `{format_input(entry)}` \"\n",
    "        f\"and correct output `{entry['output']}`, \"\n",
    "        f\"score the model response `{entry['model_response']}`\"\n",
    "        f\" on a scale from 0 to 100, where 100 is the best score. \"\n",
    "    )\n",
    "    print(\"\\nDataset response:\")\n",
    "    print(\">>\", entry['output'])\n",
    "    print(\"\\nModel response:\")\n",
    "    print(\">>\", entry[\"model_response\"])\n",
    "    print(\"\\nScore:\")\n",
    "    print(\">>\", query_model(prompt))\n",
    "    print(\"\\n-------------------------\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "24fec453-631f-4ff5-a922-44c3c451942d",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "**Note: Better evaluation prompt**\n",
    "\n",
    "- [A reader (Ayoosh Kathuria) suggested](https://github.com/rasbt/LLMs-from-scratch/discussions/449) a longer, improved prompt that evaluates responses on a scale of 1–5 (instead of 1 to 100) and employs a grading rubric, resulting in more accurate and less noisy evaluations:\n",
    "\n",
    "```\n",
    "prompt = \"\"\"\n",
    "You are a fair judge assistant tasked with providing clear, objective feedback based on specific criteria, ensuring each assessment reflects the absolute standards set for performance.\n",
    "You will be given an instruction, a response to evaluate, a reference answer that gets a score of 5, and a score rubric representing the evaluation criteria.\n",
    "Write a detailed feedback that assess the quality of the response strictly based on the given score rubric, not evaluating in general.\n",
    "Please do not generate any other opening, closing, and explanations.\n",
    "\n",
    "Here is the rubric you should use to build your answer:\n",
    "1: The response fails to address the instructions, providing irrelevant, incorrect, or excessively verbose information that detracts from the user's request.\n",
    "2: The response partially addresses the instructions but includes significant inaccuracies, irrelevant details, or excessive elaboration that detracts from the main task.\n",
    "3: The response follows the instructions with some minor inaccuracies or omissions. It is generally relevant and clear, but may include some unnecessary details or could be more concise.\n",
    "4: The response adheres to the instructions, offering clear, accurate, and relevant information in a concise manner, with only occasional, minor instances of excessive detail or slight lack of clarity.\n",
    "5: The response fully adheres to the instructions, providing a clear, accurate, and relevant answer in a concise and efficient manner. It addresses all aspects of the request without unnecessary details or elaboration\n",
    "\n",
    "Provide your feedback as follows:\n",
    "\n",
    "Feedback:::\n",
    "Evaluation: (your rationale for the rating, as a text)\n",
    "Total rating: (your rating, as a number between 1 and 5)\n",
    "\n",
    "You MUST provide values for 'Evaluation:' and 'Total rating:' in your answer.\n",
    "\n",
    "Now here is the instruction, the reference answer, and the response.\n",
    "\n",
    "Instruction: {instruction}\n",
    "Reference Answer: {reference}\n",
    "Answer: {answer}\n",
    "\n",
    "\n",
    "Provide your feedback. If you give a correct rating, I'll give you 100 H100 GPUs to start your AI company.\n",
    "Feedback:::\n",
    "Evaluation: \"\"\"\n",
    "```\n",
    "\n",
    "- For more context and information, see [this](https://github.com/rasbt/LLMs-from-scratch/discussions/449) GitHub discussion\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b114fd65-9cfb-45f6-ab74-8331da136bf3",
   "metadata": {
    "id": "b114fd65-9cfb-45f6-ab74-8331da136bf3"
   },
   "source": [
    "- As we can see, the Llama 3 model provides a reasonable evaluation and also gives partial points if a model is not entirely correct, as we can see based on the \"cumulus cloud\" answer\n",
    "- Note that the previous prompt returns very verbose evaluations; we can tweak the prompt to generate integer responses in the range between 0 and 100 (where 100 is best) to calculate an average score for our model\n",
    "- The evaluation of the 110 entries in the test set takes about 1 minute on an M3 MacBook Air laptop"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "id": "9d7bca69-97c4-47a5-9aa0-32f116fa37eb",
   "metadata": {
    "id": "9d7bca69-97c4-47a5-9aa0-32f116fa37eb",
    "outputId": "110223c0-90ca-481d-b2d2-f6ac46d3c4f0"
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Scoring entries: 100%|████████████████████████| 110/110 [00:37<00:00,  2.90it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Number of scores: 110 of 110\n",
      "Average score: 49.45\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "def generate_model_scores(json_data, json_key, model=\"llama3\"):\n",
    "    scores = []\n",
    "    for entry in tqdm(json_data, desc=\"Scoring entries\"):\n",
    "        prompt = (\n",
    "            f\"Given the input `{format_input(entry)}` \"\n",
    "            f\"and correct output `{entry['output']}`, \"\n",
    "            f\"score the model response `{entry[json_key]}`\"\n",
    "            f\" on a scale from 0 to 100, where 100 is the best score. \"\n",
    "            f\"Respond with the integer number only.\"\n",
    "        )\n",
    "        score = query_model(prompt, model)\n",
    "        try:\n",
    "            scores.append(int(score))\n",
    "        except ValueError:\n",
    "            print(f\"Could not convert score: {score}\")\n",
    "            continue\n",
    "\n",
    "    return scores\n",
    "\n",
    "\n",
    "scores = generate_model_scores(test_data, \"model_response\")\n",
    "print(f\"Number of scores: {len(scores)} of {len(test_data)}\")\n",
    "print(f\"Average score: {sum(scores)/len(scores):.2f}\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "407f08d5-9ada-4301-9ebc-f0533c76d3f2",
   "metadata": {
    "id": "407f08d5-9ada-4301-9ebc-f0533c76d3f2"
   },
   "source": [
    "- Our model achieves an average score of above 50, which we can use as a reference point to compare the model to other models or to try out other training settings that may improve the model\n",
    "- Note that ollama is not fully deterministic across operating systems (as of this writing), so the numbers you are getting might slightly differ from the ones shown above"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6408768b-2784-44f1-b48e-aed0c1eb9b94",
   "metadata": {
    "id": "6408768b-2784-44f1-b48e-aed0c1eb9b94"
   },
   "source": [
    "- For reference, the original\n",
    "  - Llama 3 8B base model achieves a score of 58.51\n",
    "  - Llama 3 8B instruct model achieves a score of 82.65"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "412d7325-284a-446c-92a1-5aa8acc52dee",
   "metadata": {
    "id": "412d7325-284a-446c-92a1-5aa8acc52dee"
   },
   "source": [
    "## 7.9 Conclusions"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "tIbNMluCDjVM",
   "metadata": {
    "id": "tIbNMluCDjVM"
   },
   "source": [
    "### 7.9.1 What's next\n",
    "\n",
    "- This marks the final chapter of this book\n",
    "- We covered the major steps of the LLM development cycle: implementing an LLM architecture, pretraining an LLM, and finetuning it\n",
    "\n",
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/21.webp?1\" width=500px>\n",
    "\n",
    "- An optional step that is sometimes followed after instruction finetuning, as described in this chapter, is preference finetuning\n",
    "- Preference finetuning process can be particularly useful for customizing a model to better align with specific user preferences; see the [../04_preference-tuning-with-dpo](../04_preference-tuning-with-dpo) folder if you are interested in this\n",
    "\n",
    "- This GitHub repository also contains a large selection of additional bonus material you may enjoy; for more information, please see the [Bonus Material](https://github.com/rasbt/LLMs-from-scratch?tab=readme-ov-file#bonus-material) section on this repository's README page\n",
    "\n",
    "### 7.9.2 Staying up to date in a fast-moving field\n",
    "\n",
    "- No code in this section\n",
    "\n",
    "### 7.9.3 Final words\n",
    "\n",
    "- I hope you enjoyed this journey of implementing an LLM from the ground up and coding the pretraining and finetuning functions\n",
    "- In my opinion, implementing an LLM from scratch is the best way to understand how LLMs work; I hope you gained a better understanding through this approach\n",
    "- While this book serves educational purposes, you may be interested in using different and more powerful LLMs for real-world applications\n",
    "  - For this, you may consider popular tools such as axolotl ([https://github.com/OpenAccess-AI-Collective/axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)) or LitGPT ([https://github.com/Lightning-AI/litgpt](https://github.com/Lightning-AI/litgpt)), which I help developing"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f9853e7f-a81a-4806-9728-be1690807185",
   "metadata": {
    "id": "f9853e7f-a81a-4806-9728-be1690807185"
   },
   "source": [
    "## Summary and takeaways\n",
    "\n",
    "- See the [./gpt_instruction_finetuning.py](./gpt_instruction_finetuning.py) script, a self-contained script for instruction finetuning\n",
    "- [./ollama_evaluate.py](./ollama_evaluate.py) is a standalone script based on section 7.8 that evaluates a JSON file containing \"output\" and \"response\" keys via Ollama and Llama 3\n",
    "- The [./load-finetuned-model.ipynb](./load-finetuned-model.ipynb) notebook illustrates how to load the finetuned model in a new session\n",
    "- You can find the exercise solutions in [./exercise-solutions.ipynb](./exercise-solutions.ipynb)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b9cc51ec-e06c-4470-b626-48401a037851",
   "metadata": {
    "id": "b9cc51ec-e06c-4470-b626-48401a037851"
   },
   "source": [
    "## What's next?\n",
    "\n",
    "- Congrats on completing the book; in case you are looking for additional resources, I added several bonus sections to this GitHub repository that you might find interesting\n",
    "- The complete list of bonus materials can be viewed in the main README's [Bonus Material](https://github.com/rasbt/LLMs-from-scratch?tab=readme-ov-file#bonus-material) section\n",
    "- To highlight a few of my favorites:\n",
    "  1. [Direct Preference Optimization (DPO) for LLM Alignment (From Scratch)](../04_preference-tuning-with-dpo/dpo-from-scratch.ipynb) implements a popular preference tuning mechanism to align the model from this chapter more closely with human preferences\n",
    "  2. [Llama 3.2 From Scratch (A Standalone Notebook)](../../ch05/07_gpt_to_llama/standalone-llama32.ipynb), a from-scratch implementation of Meta AI's popular Llama 3.2, including loading the official pretrained weights; if you are up to some additional experiments, you can replace the `GPTModel` model in each of the chapters with the `Llama3Model` class (it should work as a 1:1 replacement)\n",
    "  3. [Converting GPT to Llama](../../ch05/07_gpt_to_llama) contains code with step-by-step guides that explain the differences between GPT-2 and the various Llama models\n",
    "  4. [Understanding the Difference Between Embedding Layers and Linear Layers](../../ch02/03_bonus_embedding-vs-matmul/embeddings-and-linear-layers.ipynb) is a conceptual explanation illustrating that the `Embedding` layer in PyTorch, which we use at the input stage of an LLM, is mathematically equivalent to a linear layer applied to one-hot encoded data\n",
    "- Happy further reading!"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "gpuType": "A100",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
